How to work with Ruby strings in C extensions
#############################################

:category: Ruby
:tags: ruby gems, c
:date: 2016-04-05 01:40:00 +03:00

The Problem
===========

This post is about string types in C and Ruby. As you may know, C uses `null-terminated strings`_, while Ruby uses a more sophisticated string type. As a result, C strings cannot contain a null byte, while Ruby strings can. Many Ruby gems are written in C, but what happens when you convert a Ruby string to a C string?

.. _null-terminated strings: https://en.wikipedia.org/wiki/Null-terminated_string

.. TEASER_END

Well, that depends. There are at least two ways to do it in the Ruby C API:

RSTRING_PTR(VALUE)
------------------

.. code-block:: c

   // our_c_string must have room for RSTRING_LEN(our_ruby_string) + 1 bytes
   strlcpy(
     our_c_string,
     RSTRING_PTR(our_ruby_string),
     RSTRING_LEN(our_ruby_string) + 1 // don't forget the terminating zero
   );

.. note:: Never use RSTRING_PTR without RSTRING_LEN. Ruby string buffers may lack a null character at the end. `[*] `__

.. note:: This code previously used strncpy. I was wrong. Never use strncpy. `[*] `__

So what happens to strings containing nulls? They just get truncated.

:code:`"HAHA I'M HAXXOR! \0 SOME CORRUPT DATA"` becomes :code:`"HAHA I'M HAXXOR! "`

But there is another, better way.

StringValueCStr(VALUE)
----------------------

.. code-block:: c

   our_c_string = StringValueCStr(our_ruby_string);

Seems simple enough, but if :code:`our_ruby_string` contains a null byte, we get an exception, the very one we got from :code:`systemd-journal`:

.. code-block:: text

   ArgumentError: string contains null byte

Better, but it still may not be good enough.

What to do then?
================

There are several ways to get it right, depending on what you're trying to do.

Wrapping a C library that depends on C strings
----------------------------------------------

If the C library you're wrapping relies heavily on C strings, you have no choice. Just use :code:`StringValueCStr` and let it fail on strings that are not valid C strings.
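
To make "fail instead of truncate" concrete, here is a minimal plain-C sketch of that kind of check. It is not Ruby's actual implementation, and both helper names (:code:`has_embedded_nul` and :code:`copy_as_c_string`) are hypothetical. It assumes the string arrives as a byte buffer plus an explicit length, the way :code:`RSTRING_PTR` and :code:`RSTRING_LEN` provide it.

.. code-block:: c

   #include <stddef.h>
   #include <string.h>

   /* Hypothetical helper: returns 1 if the first len bytes of buf
      contain a null byte, 0 otherwise. */
   static int has_embedded_nul(const char *buf, size_t len)
   {
       return memchr(buf, '\0', len) != NULL;
   }

   /* Hypothetical helper: copies len bytes into dst and appends a
      terminator. Returns 0 on success, or -1 if dst is too small or the
      bytes contain a null and so would not survive as a C string. */
   static int copy_as_c_string(char *dst, size_t dst_size,
                               const char *buf, size_t len)
   {
       if (len >= dst_size || has_embedded_nul(buf, len))
           return -1;
       memcpy(dst, buf, len);
       dst[len] = '\0';
       return 0;
   }

Given the nine bytes of :code:`"HAHA\0DATA"`, :code:`has_embedded_nul` reports a null byte and :code:`copy_as_c_string` refuses the copy, whereas :code:`strlen` on the same buffer would stop at 4. Rejecting such input is the safer choice when the wrapped library can only ever see the truncated part.
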

This also happens when you use :code:`ffi` instead of writing an extension. :code:`ffi` always uses :code:`StringValueCStr` for string arguments.

Wrapping a C library that doesn't depend on C strings
-----------------------------------------------------

You may be lucky enough to find one. I got lucky when writing a wrapper for systemd-journal, because :code:`sd_journal_sendv()` uses :code:`iovec` structs as arguments, not strings. So just use it! Get the buffer pointer and length.

.. code-block:: c

   struct iovec* msgs = xcalloc(argc, sizeof(struct iovec));

   // pass each Ruby string as a pointer plus explicit length, nulls included
   for (int i = 0; i < argc; i++) {
     VALUE v = argv[i];
     msgs[i].iov_base = RSTRING_PTR(v);
     msgs[i].iov_len  = RSTRING_LEN(v);
   }

   int result = sd_journal_sendv(msgs, argc);

   xfree(msgs); // the array is ours to release once the call returns

However, doing it this way means that you must abandon :code:`ffi` and write a real extension.

Wrapping a library without strings or writing something yourself
----------------------------------------------------------------

Then just avoid C strings. Use :code:`ffi` and do all string-related things in Ruby. Or even use rice_ and write everything in C++.

.. _rice: https://github.com/jasonroelofs/rice

Conclusion
==========

C strings are awful. Avoid them.