Someone posted a blog post last month, but it was removed (I don’t know why). I had spent a lot of time verifying ChatGPT’s result, so I think it’s useful to post my experience (again).

ChatGPT was asked to convert this Java algorithm, which groups anagrams, to ABAP:

class Solution {
    public List<List<String>> groupAnagrams(String[] strs) {
        if (strs.length == 0) return new ArrayList();
        Map<String, List> ans = new HashMap<String, List>();
        int[] count = new int[26];
        for (String s : strs) {
            Arrays.fill(count, 0);
            for (char c : s.toCharArray()) count[c - 'a']++;

            StringBuilder sb = new StringBuilder("");
            for (int i = 0; i < 26; i++) {
                sb.append('#');
                sb.append(count[i]);
            }
            String key = sb.toString();
            if (!ans.containsKey(key)) ans.put(key, new ArrayList());
            ans.get(key).add(s);
        }
        return new ArrayList(ans.values());
    }
}
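To make the grouping key concrete, here is a minimal sketch that builds the same "#count" key for a single word. The class name CountKeyDemo and the helper countKey are hypothetical and only for illustration; like the code above, it assumes words made of lowercase letters 'a' to 'z'.

```java
// Hypothetical demo class; the names are illustrative, not part of the original solution.
class CountKeyDemo {
    // Builds the "#count" key the same way the solution above does.
    // Assumption: the word contains only lowercase letters 'a'..'z'.
    static String countKey(String s) {
        int[] count = new int[26];
        for (char c : s.toCharArray()) count[c - 'a']++;
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 26; i++) {
            sb.append('#').append(count[i]);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Anagrams have identical letter counts, so they share a key.
        System.out.println(countKey("are").equals(countKey("ear"))); // true
        // Words with different letters get different keys.
        System.out.println(countKey("are").equals(countKey("bat"))); // false
    }
}
```

Two words land in the same group exactly when their keys are equal, which is why the map in the solution can be keyed by this string.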

 

This was the code generated by ChatGPT:

CLASS solution DEFINITION.
  PUBLIC SECTION.
    METHODS: group_anagrams IMPORTING strs TYPE string_table
                       RETURNING VALUE(rv_anagrams) TYPE t_t_string_table.
ENDCLASS.

CLASS solution IMPLEMENTATION.
  METHOD group_anagrams.
    DATA: ans TYPE t_t_string_table,
          ca TYPE string,
          key TYPE string.
    IF lines( strs ) = 0.
      rv_anagrams = VALUE #( ).
      RETURN.
    ENDIF.
    CREATE DATA ans TYPE HASHED TABLE OF t_string_table
                      WITH UNIQUE KEY string.
    LOOP AT strs INTO ca.
      SORT ca BY character.
      key = ca.
      IF NOT ans IS INITIAL.
        INSERT VALUE #( ca ) INTO TABLE ans-values INDEX key.
      ELSE.
        INSERT VALUE #( ( ca ) ) INTO TABLE ans INDEX key.
      ENDIF.
    ENDLOOP.
    rv_anagrams = ans-values.
  ENDMETHOD.
ENDCLASS.

 

Isn’t it nice?

It’s impressive, but it doesn’t compile and the algorithm is plain wrong, with lots of approximations. Maybe soon we’ll have AI connected to Java and ABAP knowledge bases, but for now we get something that looks good yet needs a lot of rework. Let me explain.

 

I tried to complete the code to make it work. Before that, I created an ABAP test class (ltc_main, at the end of the listing below) that encodes the expected result of the Java algorithm above. Then I converted the algorithm from the Java code line by line and made sure that the test succeeds:

CLASS solution DEFINITION.
  PUBLIC SECTION.
    TYPES: t_t_string_table TYPE STANDARD TABLE OF string_table WITH DEFAULT KEY.
    METHODS: group_anagrams IMPORTING strs               TYPE string_table
                            RETURNING VALUE(rv_anagrams) TYPE t_t_string_table.
ENDCLASS.

CLASS solution IMPLEMENTATION.
  METHOD group_anagrams.
    TYPES: ty_count TYPE STANDARD TABLE OF i WITH EMPTY KEY,
           BEGIN OF ty_ans_line,
             mapkey   TYPE string,
             mapvalue TYPE string_table,
           END OF ty_ans_line,
           ty_ans TYPE HASHED TABLE OF ty_ans_line WITH UNIQUE KEY mapkey.
    IF lines( strs ) = 0. rv_anagrams = VALUE #( ). RETURN. ENDIF.
    DATA(ans) = VALUE ty_ans( ).
    DATA(count) = VALUE ty_count( FOR x = 0 WHILE x < 26 ( ) ).
    LOOP AT strs INTO DATA(s).
      DO lines( count ) TIMES. count[ sy-index ] = 0. ENDDO.
      LOOP AT VALUE string_table( FOR off = 0 WHILE off < strlen( s ) ( substring( val = s off = off len = 1 ) ) ) INTO DATA(c).
        DATA(index) = cl_abap_conv_out_ce=>uccpi( c ) - cl_abap_conv_out_ce=>uccpi( 'a' ) + 1.
        count[ index ] = count[ index ] + 1.
      ENDLOOP.
      DATA(sb) = VALUE string_table( ).
      DO 26 TIMES.
        APPEND `#` TO sb.
        APPEND |{ count[ sy-index ] }| TO sb.
      ENDDO.
      DATA(key) = concat_lines_of( table = sb ).
      IF NOT line_exists( ans[ mapkey = key ] ). INSERT VALUE #( mapkey = key ) INTO TABLE ans. ENDIF.
      ASSIGN ans[ mapkey = key ]-mapvalue TO FIELD-SYMBOL(<mapvalue>).
      INSERT s INTO TABLE <mapvalue>.
    ENDLOOP.
    rv_anagrams = VALUE #( FOR <ans_line> IN ans ( <ans_line>-mapvalue ) ).
  ENDMETHOD.
ENDCLASS.



CLASS ltc_main DEFINITION FOR TESTING DURATION SHORT RISK LEVEL HARMLESS.
  PRIVATE SECTION.
    METHODS test FOR TESTING.
ENDCLASS.
CLASS ltc_main IMPLEMENTATION.
  METHOD test.
    DATA(words) = VALUE string_table( 
        ( `are` ) ( `bat` ) ( `ear` ) ( `code` ) ( `tab` ) ( `era` ) ).
    DATA(answer) = NEW solution( )->group_anagrams( words ).
    cl_abap_unit_assert=>assert_equals( act = answer exp = VALUE solution=>t_t_string_table(
        ( VALUE #( ( `are` ) ( `ear` ) ( `era` ) ) )
        ( VALUE #( ( `bat` ) ( `tab` ) ) )
        ( VALUE #( ( `code` ) ) ) ) ).
  ENDMETHOD.
ENDCLASS.
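The expected groups in the test above can be cross-checked against the Java original. The following is only a sketch: the class GroupAnagramsCheck and the helper asSet are hypothetical names, and the generic types are spelled out instead of the raw types used in the original. The ABAP test asserts the groups in one specific order, whereas Java’s HashMap makes no ordering promise, so this check compares the groups as a set.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical harness; class and method names are illustrative only.
class GroupAnagramsCheck {
    // Same count-key algorithm as the Java original, with generic types spelled out.
    static List<List<String>> groupAnagrams(String[] strs) {
        Map<String, List<String>> ans = new HashMap<>();
        int[] count = new int[26];
        for (String s : strs) {
            Arrays.fill(count, 0);
            for (char c : s.toCharArray()) count[c - 'a']++;
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < 26; i++) sb.append('#').append(count[i]);
            ans.computeIfAbsent(sb.toString(), k -> new ArrayList<>()).add(s);
        }
        return new ArrayList<>(ans.values());
    }

    // HashMap iteration order is unspecified, so compare the groups as a set of lists.
    static Set<List<String>> asSet(List<List<String>> groups) {
        return new HashSet<>(groups);
    }

    public static void main(String[] args) {
        // Same six words as the ABAP unit test.
        String[] words = {"are", "bat", "ear", "code", "tab", "era"};
        Set<List<String>> expected = new HashSet<>(Arrays.asList(
            Arrays.asList("are", "ear", "era"),
            Arrays.asList("bat", "tab"),
            Arrays.asList("code")));
        System.out.println(asSet(groupAnagrams(words)).equals(expected)); // true
    }
}
```

Within each group the words keep their input order, so only the order of the groups themselves needs the set comparison.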

 

Then I wanted to see how to correct ChatGPT’s version with the fewest possible changes. This is the result, and below it is a screenshot of the diff:

CLASS solution DEFINITION.
  PUBLIC SECTION.
    TYPES: t_t_string_table TYPE STANDARD TABLE OF string_table WITH DEFAULT KEY.
    METHODS: group_anagrams IMPORTING strs TYPE string_table
                       RETURNING VALUE(rv_anagrams) TYPE t_t_string_table.
ENDCLASS.

CLASS solution IMPLEMENTATION.
  METHOD group_anagrams.
    TYPES: BEGIN OF t_string_table,
             string   TYPE string,
             mapvalue TYPE string_table,
           END OF t_string_table,
           t_string_table_2 TYPE HASHED TABLE OF t_string_table WITH UNIQUE KEY string.
    DATA: ans TYPE REF TO t_string_table_2,
          key TYPE string.
    IF lines( strs ) = 0.
      rv_anagrams = VALUE #( ).
      RETURN.
    ENDIF.
    CREATE DATA ans TYPE HASHED TABLE OF t_string_table
                      WITH UNIQUE KEY string.
    LOOP AT strs INTO DATA(s).
      DATA(ca) = VALUE string_table( FOR off = 0 WHILE off < strlen( s ) ( substring( val = s off = off len = 1 ) ) ).
      SORT ca BY table_line.
      key = concat_lines_of( table = ca ).
      IF NOT line_exists( ans->*[ string = key ] ).
        INSERT VALUE #( string = key ) INTO TABLE ans->*.
      ENDIF.
      ASSIGN ans->*[ string = key ]-mapvalue TO FIELD-SYMBOL(<mapvalue>).
      INSERT s INTO TABLE <mapvalue>.
    ENDLOOP.
    rv_anagrams = VALUE #( FOR <ans_line> IN ans->* ( <ans_line>-mapvalue ) ).
  ENDMETHOD.
ENDCLASS.

Diff between ChatGPT code and code fixed with minimal changes
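Note that the minimally fixed version keeps ChatGPT’s idea of sorting the characters of each word and using the sorted string as the key, instead of the letter-count key of the Java original. For comparison, a rough Java sketch of that sorted-key variant could look like the following. The class name SortedKeyGrouping is hypothetical, and LinkedHashMap is my own choice here so that the groups come out in first-seen order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical class name; a sketch of the sorted-key idea, not the original solution.
class SortedKeyGrouping {
    static List<List<String>> groupAnagrams(String[] strs) {
        // Assumption: LinkedHashMap is used only to keep groups in first-seen order.
        Map<String, List<String>> ans = new LinkedHashMap<>();
        for (String s : strs) {
            char[] ca = s.toCharArray();
            Arrays.sort(ca);             // anagrams sort to the same character sequence
            String key = new String(ca); // e.g. "are", "ear" and "era" all give "aer"
            ans.computeIfAbsent(key, k -> new ArrayList<>()).add(s);
        }
        return new ArrayList<>(ans.values());
    }

    public static void main(String[] args) {
        String[] words = {"are", "bat", "ear", "code", "tab", "era"};
        System.out.println(groupAnagrams(words)); // [[are, ear, era], [bat, tab], [code]]
    }
}
```

Both keys group the same words; the sorted key is shorter to write, while the count key avoids sorting each word.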

What it took me to make these fixes:

  • I had to understand the algorithm (from the Java code, for instance).
  • It was impossible for me to understand the crazy code generated by ChatGPT FIRST, without knowing the algorithm.
  • After that, I had to work out which parts of the ChatGPT code correspond to which parts of the algorithm, and only then could I find the minimal changes.

Conclusion: finding how to fix the program with minimal changes took me about 3x longer than writing the algorithm from scratch (3x is my own rough estimate).

But I guess ChatGPT could be trained on ABAP code equivalent to Java code, and maybe that would result in more reusable code in the future…

Possibly, ChatGPT could be used just to generate simple logic. Let’s see if anybody can propose real use cases, and not just “oh look how good the code looks” without checking what it takes to fix the code.

 

Sara Sampaio

Author Since: March 10, 2022
