Wednesday 9 August 2017

ABAP Nested Loop Performance Test

During an ABAP performance discussion, we decided to run tests to decide on the best method for nested loops. The example was to have an outer loop of BKPF and an inner loop of BSEG.

Nesting two raw loops gives the worst performance in all cases; therefore I have left it out.

In our initial test using the source code shared below; here are the methods ordered by better performance:
  1. Hashed table: 491K
  2. Binary search: 546K
  3. Parallel cursor: 1.100K
  4. Sorted table: 1.688K
When BUKRS is CHAR4, GJAHR is CHAR4 (fixed: 2015), BELNR is CHAR10;
  1. Parallel cursor: 497K
  2. Hashed table: 554K
  3. Sorte table: 800K
  4. Binary search: 1.295K
When BUKRS is CHAR4 (fixed: 1000), GJAHR is CHAR4 (fixed: 2015), BELNR is CHAR10;
  1. Sorted table: 250K
  2. Parallel cursor: 307K
  3. Hashed table: 381K
  4. Binary search: 773K
When BUKRS is CHAR40, GJAHR is CHAR40 (fixed: 2015), BELNR is CHAR40;
  1. Hashed table: 794K
  2. Parallel cursor: 1.030K
  3. Sorted table: 1.764K
  4. Binary search: 2.813K
When BUKRS is CHAR40 (fixed: 1000), GJAHR is CHAR40 (fixed: 2015), BELNR is CHAR40;
  1. Hashed table: 777K
  2. Sorted table: 998K
  3. Parallel cursor: 1.393K
  4. Binary search: 2.632K
Obviously, the best performance depends on the dataset. In case you can relate your dataset with one of the scenarios above, you can pick and use the best method.

For (most) cases where data is not predictable, I have calculated average runtime values for each method. The result is;
  1. Hashed table: ~600K
  2. Parallel cursor: ~900K
  3. Sorted table: ~1000K
  4. Binary search: ~1600K
As a conclusion, we can say that using a hashed table and avoiding the inner loop is the best overall method. The second best alternative is almost a tie between parallel cursor and sorted table – you can use whichever is possible. Using binary search to avoid the inner loop seems to be the worse overall method.

Personally; I would prefer sorted tables over parallel cursors because they are more readable.

Here is the original code used for measurement.

REPORT ztest003.

DATA:
  gt_bkpf  TYPE STANDARD TABLE OF bkpf,
  gt_bseg  TYPE STANDARD TABLE OF bseg,
  gv_begin TYPE i.

PARAMETERS:
  p_head TYPE i OBLIGATORY DEFAULT 50000,
  p_item TYPE i OBLIGATORY DEFAULT 10.

PERFORM main.

FORM loop_curs.

  DATA:
    lt_bkpf  TYPE STANDARD TABLE OF bkpf,
    lt_bseg  TYPE STANDARD TABLE OF bseg,
    lv_index TYPE i.

  lt_bkpf[] = gt_bkpf[].
  lt_bseg[] = gt_bseg[].

  SORT:
    lt_bkpf BY bukrs belnr gjahr,
    lt_bseg BY bukrs belnr gjahr.


  LOOP AT lt_bkpf ASSIGNING FIELD-SYMBOL(<ls_bkpf>).
    LOOP AT lt_bseg ASSIGNING FIELD-SYMBOL(<ls_bseg>) FROM lv_index.


      IF <ls_bseg>-bukrs NE <ls_bkpf>-bukrs OR
         <ls_bseg>-belnr NE <ls_bkpf>-belnr OR
         <ls_bseg>-gjahr NE <ls_bkpf>-gjahr.

        lv_index = sy-tabix.
        EXIT.

      ENDIF.

    ENDLOOP.
  ENDLOOP.

ENDFORM.

FORM loop_bina.

  DATA:
    lt_bkpf TYPE STANDARD TABLE OF bkpf,
    lt_bseg TYPE STANDARD TABLE OF bseg.

  lt_bkpf[] = gt_bkpf[].
  SORT lt_bkpf BY bukrs belnr gjahr.

  lt_bseg[] = gt_bseg.

  LOOP AT lt_bseg ASSIGNING FIELD-SYMBOL(<ls_bseg>).

    READ TABLE lt_bkpf ASSIGNING FIELD-SYMBOL(<ls_bkpf>)
      WITH KEY
        bukrs = <ls_bseg>-bukrs
        belnr = <ls_bseg>-belnr
        gjahr = <ls_bseg>-gjahr
      BINARY SEARCH.

  ENDLOOP.

ENDFORM.

FORM loop_hash.

  DATA:
    lt_bkpf TYPE HASHED TABLE OF bkpf
      WITH UNIQUE KEY primary_key
      COMPONENTS bukrs belnr gjahr,

    lt_bseg TYPE STANDARD TABLE OF bseg.

  lt_bkpf[] = gt_bkpf[].
  lt_bseg[] = gt_bseg.

  LOOP AT lt_bseg ASSIGNING FIELD-SYMBOL(<ls_bseg>).

    ASSIGN lt_bkpf[ KEY primary_key COMPONENTS
      bukrs = <ls_bseg>-bukrs
      belnr = <ls_bseg>-belnr
      gjahr = <ls_bseg>-gjahr
    ] TO FIELD-SYMBOL(<ls_bkpf>).

  ENDLOOP.

ENDFORM.

FORM loop_loop.

  DATA:
    lt_bkpf TYPE STANDARD TABLE OF bkpf,
    lt_bseg TYPE STANDARD TABLE OF bseg.

  lt_bkpf[] = gt_bkpf[].
  lt_bseg[] = gt_bseg[].

  LOOP AT lt_bkpf ASSIGNING FIELD-SYMBOL(<ls_bkpf>).

    LOOP AT lt_bseg ASSIGNING FIELD-SYMBOL(<ls_bseg>)
                    WHERE bukrs EQ <ls_bkpf>-bukrs
                      AND belnr EQ <ls_bkpf>-belnr
                      AND gjahr EQ <ls_bkpf>-gjahr.

    ENDLOOP.

  ENDLOOP.

ENDFORM.

FORM loop_sort.

  DATA:
    lt_bkpf TYPE STANDARD TABLE OF bkpf,
    lt_bseg TYPE SORTED TABLE OF bseg WITH NON-UNIQUE KEY bukrs belnr gjahr.

  lt_bkpf[] = gt_bkpf[].
  lt_bseg[] = gt_bseg[].

  LOOP AT lt_bkpf ASSIGNING FIELD-SYMBOL(<ls_bkpf>).

    LOOP AT lt_bseg ASSIGNING FIELD-SYMBOL(<ls_bseg>)
                    WHERE bukrs EQ <ls_bkpf>-bukrs
                      AND belnr EQ <ls_bkpf>-belnr
                      AND gjahr EQ <ls_bkpf>-gjahr.

    ENDLOOP.

  ENDLOOP.

ENDFORM.

FORM main.

  PERFORM:
    prepare_data,
    "start_chrono, loop_loop, stop_chrono USING 'Regular loop',
    start_chrono, loop_hash, stop_chrono USING 'Hashed table',
    start_chrono, loop_bina, stop_chrono USING 'Binary search',
    start_chrono, loop_sort, stop_chrono USING 'Sorted table',
    start_chrono, loop_curs, stop_chrono USING 'Parallel cursor'.

ENDFORM.

FORM prepare_data.

  DATA:
    lv_belnr TYPE bkpf-belnr,
    lv_buzei TYPE bseg-buzei.

  DO p_head TIMES.

    ADD 1 TO lv_belnr.

    APPEND VALUE #(
      bukrs = 1000
      belnr = lv_belnr
      gjahr = sy-datum+0(4)
    ) TO gt_bkpf.

    CLEAR lv_buzei.

    DO p_item TIMES.
      ADD 1 TO lv_buzei.

      APPEND VALUE #(
        bukrs = 1000
        belnr = lv_belnr
        gjahr = sy-datum+0(4)
        buzei = lv_buzei
      ) TO gt_bseg.

    ENDDO.

  ENDDO.

ENDFORM.

FORM start_chrono.
  GET RUN TIME FIELD gv_begin.
ENDFORM.

FORM stop_chrono USING iv_text.

  DATA:
    lv_diff TYPE i,
    lv_end  TYPE i.

  GET RUN TIME FIELD lv_end.
  lv_diff = lv_end - gv_begin.

  NEW-LINE.
  WRITE |{ iv_text } runtime: { lv_diff } |.

ENDFORM.

No comments:

Post a Comment