Here’s the program:

rna_transcription.c

#include "rna_transcription.h"
#include <stdlib.h>
#include <string.h>

static const char lookup[] = {
    ['A'] = 'U',
    ['C'] = 'G',
    ['G'] = 'C',
    ['T'] = 'A'
};

char *to_rna(const char *dna)
{
    if (!dna)
        return NULL;

    char *rna = calloc(strlen(dna) + 1, 1), *start_rna = rna;
    if (rna)
    {
        for (; *dna; dna++, rna++)
        {
            if (!(*rna = lookup[(int)*dna]))
            {
                free(start_rna); /* rna has been advanced; free the original allocation */
                return NULL;
            }
        }
    }
    return start_rna;
}

rna_transcription.h

#ifndef RNA_TRANSCRIPTION_H
#define RNA_TRANSCRIPTION_H

char *to_rna(const char *dna);

#endif

I can’t help but wonder how much of a waste of space the array would be. Surely, using a map is better, right?
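
For a sense of scale, here is a quick sketch of how big that table actually is. It assumes an ASCII execution character set, where 'T' is 84, and `lookup_size` is a hypothetical helper that is not part of the exercise:

```c
#include <stddef.h>

/* Same designated-initializer table as in the program above. */
static const char lookup[] = {
    ['A'] = 'U',
    ['C'] = 'G',
    ['G'] = 'C',
    ['T'] = 'A'
};

/* The array ends at its largest designated index, so it has 'T' + 1
   one-byte elements (85 bytes under ASCII). Requires C11. */
_Static_assert(sizeof lookup == (size_t)'T' + 1, "table spans indices 0..'T'");

/* Hypothetical helper, only here to expose the size at run time. */
size_t lookup_size(void)
{
    return sizeof lookup;
}
```

So the cost in question is well under a hundred bytes of read-only data, most of it unused zero slots.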

  • @GissaMittJobb
    26 days ago

    A few things come to mind:

    1. The array is probably fine. It’s not going to be particularly large and lookups are O(1)
    2. It’s a bit weird to me that you’re advancing the char pointers themselves instead of indexing into the strings. I would probably use actual indices and spend the minuscule amount of additional stack space to improve the clarity of the code
    3. You could save on some indentation by returning early instead of nesting the for-loop inside the first if-statement
    4. Is the access to the lookup table really safe? Any input character above 'T' (or a negative char value) indexes outside the array, so checking that the token from the DNA is within the bounds is probably the way to go
    5. The only thing I would even remotely care about with regards to performance is the malloc, and that’s not that big of a deal anyway unless the length of dna is really large. Streaming the result or overwriting the presumably already malloc’d input would be the only thing I would touch, and only if I could prove that it improves performance in practice.
    6. (added in edit): if you can guarantee that the input is well formed, you can omit the bounds check and save some effort there.
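
    As a rough sketch of points 2 to 4 taken together (my own illustration, not the original author's code), assuming the same contract of returning NULL on invalid input:

    ```c
    #include <stddef.h>
    #include <stdlib.h>
    #include <string.h>

    static const char lookup[] = {
        ['A'] = 'U',
        ['C'] = 'G',
        ['G'] = 'C',
        ['T'] = 'A'
    };

    /* Sketch only: index-based loop, early returns, bounds-checked lookup. */
    char *to_rna(const char *dna)
    {
        if (!dna)
            return NULL;

        size_t len = strlen(dna);
        char *rna = calloc(len + 1, 1);   /* zero-filled, so already terminated */
        if (!rna)
            return NULL;                  /* early return instead of nesting */

        for (size_t i = 0; i < len; i++)
        {
            /* unsigned char avoids a negative index where char is signed */
            unsigned char c = (unsigned char)dna[i];

            /* reject anything past the table, or a slot left at zero */
            if (c >= sizeof lookup || !lookup[c])
            {
                free(rna);                /* rna still points at the start */
                return NULL;
            }
            rna[i] = lookup[c];
        }
        return rna;
    }
    ```

    With this version `to_rna("ACGT")` yields `"UGCA"`, and something like `to_rna("AXz")` returns NULL instead of reading past the end of the table.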
    • @velox_vulnusOP
      26 days ago

      That is not my program actually. Here’s what I’ve come up with:

      rna_transcription.c

      #include "rna_transcription.h"
      
      static char transcribe_nucleotide(char nucleotide) {
          switch (nucleotide) {
              case 'G':
                  return 'C';
              case 'C':
                  return 'G';
              case 'T':
                  return 'A';
              case 'A':
                  return 'U';
              default:
                  return nucleotide;
          }
      }
      
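      char *to_rna(const char *dna) {
          if (!dna) {
              return NULL;
          }

          size_t len = strlen(dna);
          char *rna = malloc(len + 1);
          if (!rna) {
              return NULL; /* allocation failed */
          }

          /* i <= len also copies the terminating '\0' via the default case */
          for (size_t i = 0; i <= len; ++i) {
              rna[i] = transcribe_nucleotide(dna[i]);
          }

          return rna;
      }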
      

      rna_transcription.h

      #ifndef RNA_TRANSCRIPTION_H
      #define RNA_TRANSCRIPTION_H
      
      #include <string.h>
      #include <stdlib.h>
      
      char *to_rna(const char *dna);
      
      #endif
      

      I could not find an equivalent of a map in the standard library, which is why I was interested in the community solutions.

      • @GissaMittJobb
        26 days ago

        A switch statement is probably a decent option here, yeah. You trade off a little bit of memory for what might be a few more instructions executing the switch statement, unless the compiler picks up on it and optimizes it. Maybe check godbolt for what gets generated in practice if you really care about it.
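
        If you want something concrete to paste into Compiler Explorer, a minimal sketch would be the two variants side by side. The function names and the return-0-for-invalid convention are my assumptions here, not taken from either program above:

        ```c
        /* Hypothetical names; both map one nucleotide, 0 means "invalid". */
        char complement_switch(char n)
        {
            switch (n) {
                case 'G': return 'C';
                case 'C': return 'G';
                case 'T': return 'A';
                case 'A': return 'U';
                default:  return 0;
            }
        }

        /* Table variant: unused slots are zero-initialized. */
        static const char table[] = {
            ['A'] = 'U', ['C'] = 'G', ['G'] = 'C', ['T'] = 'A'
        };

        char complement_table(char n)
        {
            /* unsigned char keeps the index non-negative; then bounds-check */
            unsigned char c = (unsigned char)n;
            return c < sizeof table ? table[c] : 0;
        }
        ```

        Both functions return the same result for every input, so any difference you see in the generated assembly is purely down to how the compiler lowers each form.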