How to work with Ruby strings in C extensions
The Problem
This post is about string types in C and Ruby. As you may know, C uses null-terminated strings, while Ruby uses a more sophisticated string type that stores its length explicitly. As a result, a C string cannot contain a null byte, but a Ruby string can. Many Ruby gems are written in C, so what happens when you convert a Ruby string to a C string?
Well, that depends. There are at least two ways to do it in the Ruby C API:
RSTRING_PTR(VALUE)
strlcpy(
    our_c_string,
    RSTRING_PTR(our_ruby_string),
    RSTRING_LEN(our_ruby_string) + 1 /* don't forget the terminating zero */
);
So what happens to strings containing nulls? They just get truncated.
"HAHA I'M HAXXOR! \0 SOME CORRUPT DATA"
becomes "HAHA I'M HAXXOR! "
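The truncation can be reproduced without Ruby at all. The sketch below is plain C with no Ruby headers: the ptr/len pair stands in for what RSTRING_PTR and RSTRING_LEN would return, and the helper name visible_c_length is made up for this illustration.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: how many bytes of a length-counted buffer survive
 * when the buffer is treated as a C string. ptr/len stand in for
 * RSTRING_PTR/RSTRING_LEN; nothing here touches the Ruby C API.
 */
size_t visible_c_length(const char *ptr, size_t len)
{
    /* C string functions stop at the first NUL byte, if there is one. */
    const char *nul = memchr(ptr, '\0', len);
    return nul ? (size_t)(nul - ptr) : len;
}
```

For the 36-byte string above it reports 17: everything from the null byte onward is invisible to strlcpy, strlen and friends.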
But there is another, better way.
StringValueCStr(VALUE)
our_c_string = StringValueCStr(our_ruby_string);
Seems simple enough, but if our_ruby_string contains a null byte, we get an exception, the same one we got from systemd-journal:
ArgumentError: string contains null byte
Better, but it still may not be good enough.
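What that check buys you can be sketched in plain C. This is not how Ruby implements StringValueCStr; it is a stand-alone illustration, and the function name and result codes are invented for it. The idea is to refuse the conversion rather than silently drop data.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical result codes, used by this sketch only. */
enum { CSTR_OK = 0, CSTR_HAS_NUL = -1, CSTR_TOO_SMALL = -2 };

/*
 * Copy len bytes from src into dst as a NUL-terminated C string,
 * but only if nothing would be lost along the way.
 */
int copy_as_c_string(char *dst, size_t dst_size, const char *src, size_t len)
{
    if (memchr(src, '\0', len) != NULL)
        return CSTR_HAS_NUL;      /* embedded NUL: a C string cannot hold it */
    if (len >= dst_size)
        return CSTR_TOO_SMALL;    /* no room for the data plus the terminator */
    memcpy(dst, src, len);
    dst[len] = '\0';
    return CSTR_OK;
}
```

The caller then has to decide what a failure means, which is exactly the position the ArgumentError puts you in.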
What to do then?
There are several options to make it right, depending on what you're trying to do.
Wrapping a C library that depends on C strings
If the C library you're wrapping relies heavily on C strings, you have no choice. Just use StringValueCStr and let it fail on strings that are not valid C strings.
The same happens when you use ffi instead of writing an extension: ffi always uses StringValueCStr for string arguments.
Wrapping a C library that doesn't depend on C strings
You may be lucky enough to find one. I got lucky when writing a wrapper for systemd-journal, because sd_journal_sendv() takes iovec structs as arguments, not strings. So just use them! Get the buffer pointer and the length.
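struct iovec *msgs = xcalloc(argc, sizeof(struct iovec));
for (int i = 0; i < argc; i++) {
    VALUE v = argv[i];
    msgs[i].iov_base = RSTRING_PTR(v);   /* pointer to the string's bytes */
    msgs[i].iov_len  = RSTRING_LEN(v);   /* explicit length, nulls included */
}
int result = sd_journal_sendv(msgs, argc);
xfree(msgs);   /* frees only the iovec array; the bytes still belong to Ruby */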
However, doing it this way means that you must abandon ffi and write a real extension.
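The pointer-plus-length idea can be tried outside of Ruby and systemd. The following is a minimal sketch in plain C, assuming a POSIX system for struct iovec; struct field and fill_iovecs are hypothetical names, with struct field playing the part of a Ruby string.

```c
#include <stddef.h>
#include <sys/uio.h>   /* struct iovec (POSIX) */

/* Hypothetical stand-in for a Ruby string: bytes plus an explicit length. */
struct field {
    const char *data;
    size_t      len;
};

/*
 * Describe n fields as iovecs without copying or scanning the bytes.
 * Returns the total number of bytes described.
 */
size_t fill_iovecs(struct iovec *out, const struct field *in, int n)
{
    size_t total = 0;
    for (int i = 0; i < n; i++) {
        out[i].iov_base = (void *)in[i].data;  /* iov_base is not const-qualified */
        out[i].iov_len  = in[i].len;           /* stored as given, never derived via strlen */
        total += in[i].len;
    }
    return total;
}
```

An array filled this way could be handed to writev() or, as in the extension code above, to sd_journal_sendv(). Because the length travels with the pointer, a null byte in the middle of a field is just another byte.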
Wrapping a library without strings or writing something yourself
Then just avoid C strings. Use ffi and do all string-related work in Ruby. Or even use Rice and write everything in C++.
Conclusion
C strings are awful. Avoid them.