RUBY INTERNALS
    Use the source, Luke!




 TW: @burkelibbey GH: @burke
TOPICS
• Basic   object structure

• Class   inheritance

• Singleton   classes

• Module    inheritance

• Contexts
BASIC OBJECT STRUCTURE
struct RBasic {
                  VALUE flags;
                  VALUE klass;
              };


Every heap-allocated object in Ruby starts with an RBasic.
(Immediates such as small integers, nil, true and false live in the VALUE itself.)
struct RBasic {
                   VALUE flags;
                   VALUE klass;
               };


flags stores information like whether the object is
           frozen, or tainted, or others.
struct RBasic {
             VALUE flags;
             VALUE klass;
         };


klass is a pointer to the object's class
          (or singleton class)
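
Both fields can be observed from plain Ruby. A minimal sketch; `Whisky` and `peaty?` are made-up names for illustration, and the comments describe what MRI stores rather than anything Ruby guarantees:

```ruby
# Hypothetical class, used only for illustration.
class Whisky; end

dram = Whisky.new

# klass: #class follows the klass pointer (skipping any singleton class).
puts dram.class      # Whisky

# flags: freezing an object sets a bit in its flags word.
puts dram.frozen?    # false
dram.freeze
puts dram.frozen?    # true

# A singleton method makes klass point at a singleton class instead.
other = Whisky.new
def other.peaty?; true; end
puts other.singleton_class.instance_methods(false).inspect  # [:peaty?]
puts other.class     # Whisky
```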
typedef uintptr_t VALUE;



VALUE is used like a void pointer in ruby C code.
struct RFloat {
    struct RBasic basic;
    double float_value;
};

      This is a float.
struct RFloat {
        struct RBasic basic;
        double float_value;
    };

Like everything else, it has an RBasic.
struct RFloat {
          struct RBasic basic;
          double float_value;
      };

...and also the actual floating point value.
brb c
#define ROBJECT_EMBED_LEN_MAX 3
struct RObject {
  struct RBasic basic;
  union {
    struct {
      long numiv;
      VALUE *ivptr;
      struct st_table *iv_index_tbl;
    } heap;
    VALUE ary[ROBJECT_EMBED_LEN_MAX];
  } as;
};

         This is a generic Object.
#define ROBJECT_EMBED_LEN_MAX 3
struct RObject {
  struct RBasic basic;
  union {
    struct {
      long numiv;
      VALUE *ivptr;
      struct st_table *iv_index_tbl;
    } heap;
    VALUE ary[ROBJECT_EMBED_LEN_MAX];
  } as;
};

   You can pretty much ignore this stuff.
#define ROBJECT_EMBED_LEN_MAX 3
  struct RObject {
    struct RBasic basic;
    union {
      struct {
        long numiv;
        VALUE *ivptr;
        struct st_table *iv_index_tbl;
      } heap;
      VALUE ary[ROBJECT_EMBED_LEN_MAX];
    } as;
  };

Again, it has an RBasic representing its class (klass)
            and internal attributes (flags).
#define ROBJECT_EMBED_LEN_MAX 3
struct RObject {
  struct RBasic basic;
  union {
    struct {
      long numiv;
      VALUE *ivptr;
      struct st_table *iv_index_tbl;
    } heap;
    VALUE ary[ROBJECT_EMBED_LEN_MAX];
  } as;
};

       It also has instance variables.
      long numiv;
      VALUE *ivptr;
      struct st_table *iv_index_tbl;




   ivptr points to an array of ivar values.
      long numiv;
      VALUE *ivptr;
      struct st_table *iv_index_tbl;




Unsurprisingly, numiv is the number of ivars.
      long numiv;
      VALUE *ivptr;
      struct st_table *iv_index_tbl;




    iv_index_tbl is essentially a hash of
         {name -> index into ivptr}
      long numiv;
      VALUE *ivptr;
      struct st_table *iv_index_tbl;




 st_table is a C hashtable implementation.
 It’s also the underpinning for ruby hashes.
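
The heap layout can be modelled in a few lines of Ruby. This is a toy sketch, not how MRI is written: `ToyClass` and `ToyObject` are hypothetical names, and it assumes that instances of one class share a single name-to-index table.

```ruby
# Toy model of the RObject heap layout. Hypothetical names throughout.
class ToyClass
  attr_reader :iv_index_tbl

  def initialize
    @iv_index_tbl = {}   # { ivar name => index into ivptr }
  end
end

class ToyObject
  def initialize(klass)
    @klass = klass
    @ivptr = []          # flat array of ivar values
  end

  def ivar_set(name, value)
    tbl = @klass.iv_index_tbl
    index = tbl[name] ||= tbl.size   # a new name gets the next free slot
    @ivptr[index] = value
  end

  def ivar_get(name)
    index = @klass.iv_index_tbl[name]
    index && @ivptr[index]
  end
end

klass = ToyClass.new
a = ToyObject.new(klass)
b = ToyObject.new(klass)
a.ivar_set(:@age, 14)
b.ivar_set(:@age, 18)       # reuses slot 0 from the shared table
b.ivar_set(:@tasty, true)   # takes slot 1

puts a.ivar_get(:@age)      # 14
puts b.ivar_get(:@tasty)    # true
```

Storing names once per class and values in a flat per-object array keeps each object small.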
#define ROBJECT_EMBED_LEN_MAX 3
struct RObject {
  struct RBasic basic;
  union {
    struct {
      long numiv;
      VALUE *ivptr;
      struct st_table *iv_index_tbl;
    } heap;
    VALUE ary[ROBJECT_EMBED_LEN_MAX];
  } as;
};

            Back to the top
An Object has:


            • klass   (its class)

            • flags    (frozen? tainted? etc.)

            • Instance    variables

            • Nothing     else.
brb c
We saw that Float has a distinct implementation
                from Object
String, Regexp, Array, Hash, File, Rational, Complex,
             Data, and Bignum do too.

      These are mostly for performance reasons.
brb c
Class is the other exception.


Not just for performance. It has a lot of extra behaviour.
struct RClass {
    struct RBasic basic;
    rb_classext_t *ptr;
    struct st_table *m_tbl;
    struct st_table *iv_index_tbl;
};
struct RClass {
       struct RBasic basic;
       rb_classext_t *ptr;
       struct st_table *m_tbl;
       struct st_table *iv_index_tbl;
   };



A class has attributes (flags) and a class (klass).
struct RClass {
      struct RBasic basic;
      rb_classext_t *ptr;
      struct st_table *m_tbl;
      struct st_table *iv_index_tbl;
  };



rb_classext_t stores more class-specific info
struct RClass {
    struct RBasic basic;
    rb_classext_t *ptr;
    struct st_table *m_tbl;
    struct st_table *iv_index_tbl;
};



m_tbl is a hash of methods. Think of it as:
        {name -> method body}
struct RClass {
        struct RBasic basic;
        rb_classext_t *ptr;
        struct st_table *m_tbl;
        struct st_table *iv_index_tbl;
    };



Just like iv_index_tbl on RObject, except the rest
   of the ivar storage is done in rb_classext_t.
struct rb_classext_struct {
    VALUE super;
    struct st_table *iv_tbl;
    struct st_table *const_tbl;
};
typedef struct rb_classext_struct 
   rb_classext_t;



 This is the extended class information.
struct rb_classext_struct {
    VALUE super;
    struct st_table *iv_tbl;
    struct st_table *const_tbl;
};
typedef struct rb_classext_struct 
   rb_classext_t;



‘super’ is a pointer to the class’s superclass.
struct rb_classext_struct {
    VALUE super;
    struct st_table *iv_tbl;
    struct st_table *const_tbl;
};
typedef struct rb_classext_struct 
   rb_classext_t;



iv_tbl is a hash of {ivar name -> ivar value}
struct rb_classext_struct {
    VALUE super;
    struct st_table *iv_tbl;
    struct st_table *const_tbl;
};
typedef struct rb_classext_struct 
   rb_classext_t;



 similarly, const_tbl stores constants as
      {const name -> const value}
struct RClass {
       VALUE flags; //   attributes
       VALUE klass; //   the class's own class (often Class)
       VALUE super; //   superclass (often Object)
       struct st_table   *iv_tbl;       // ivars
       struct st_table   *const_tbl;    // constants
       struct st_table   *m_tbl;        // methods
       struct st_table   *iv_index_tbl; // ivars
   };




An incorrect but helpful simplification of RClass.
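
With m_tbl and super in hand, method lookup is a short loop. A toy sketch in Ruby, not MRI's actual code: `ToyRClass` and `find_method` are hypothetical names, and method bodies are stood in for by lambdas.

```ruby
# Toy stand-in for the simplified RClass above. Hypothetical names.
ToyRClass = Struct.new(:name, :super_class, :m_tbl)

# Check this class's m_tbl, then follow super until a hit or the chain ends.
def find_method(klass, method_name)
  while klass
    body = klass.m_tbl[method_name]
    return body if body
    klass = klass.super_class
  end
  nil
end

liquor = ToyRClass.new("Liquor", nil, { :drink => -> { "glug" } })
scotch = ToyRClass.new("Scotch", liquor, { :tasty? => -> { true } })

p find_method(scotch, :tasty?).call   # true
p find_method(scotch, :drink).call    # "glug"
p find_method(scotch, :missing)       # nil
```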
brb c
CLASS INHERITANCE
Let’s look at an example class and its RClass
class Oban < Scotch
  AGE = 14
  @tasty = true
  def tasty
    Oban.instance_variable_get("@tasty")
  end
end
class Oban < Scotch
  AGE = 14
  @tasty = true
  def tasty
    Oban.instance_variable_get("@tasty")
  end
end

 basic.klass    Class
 ptr->super     Scotch
 iv_tbl         {"@tasty" => true}
 const_tbl      {"AGE" => 14}
 m_tbl          {"tasty" => ...}
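
Each of those fields has a rough Ruby-level counterpart. A small sketch; `Liquor` and `Scotch` are defined here only so the example runs on its own, and the field names in the comments are the C-side names from the table:

```ruby
# Parents defined only to make the example self-contained.
class Liquor; end
class Scotch < Liquor; end

class Oban < Scotch
  AGE = 14
  @tasty = true
  def tasty
    Oban.instance_variable_get("@tasty")
  end
end

p Oban.class                     # basic.klass  -> Class
p Oban.superclass                # ptr->super   -> Scotch
p Oban.instance_variables        # iv_tbl keys  -> [:@tasty]
p Oban.constants(false)          # const_tbl    -> [:AGE]
p Oban.instance_methods(false)   # m_tbl keys   -> [:tasty]
```

Note that @tasty lives on the class object itself, which is why the instance method has to ask Oban for it.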
Another:

   class Animal ; end

class Dog < Animal ; end
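
The super pointers form a chain that can be walked from Ruby. A minimal sketch; `superclass_chain` is a hypothetical helper name:

```ruby
class Animal; end
class Dog < Animal; end

# Follow superclass links until they run out.
def superclass_chain(klass)
  chain = []
  while klass
    chain << klass
    klass = klass.superclass
  end
  chain
end

p superclass_chain(Dog)   # [Dog, Animal, Object, BasicObject]
```

`Dog.ancestors` returns a similar list that also includes modules such as Kernel, which `superclass` skips over.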
Let’s look at another class:
class Scotch < Liquor
  def tasty?
    true
  end
end
class Scotch < Liquor
             def tasty?
               true
             end
           end


This lets us call: Scotch.new.tasty?
class Scotch < Liquor
               def tasty?
                 true
               end
             end


This lets us call: Scotch.new.tasty?
And puts {“tasty?” -> ...} into the m_tbl
class Scotch < Liquor
       def tasty?
         true
       end
     end


What if we wanted ‘tasty?’ to be a class
             method?
It clearly works, but how does ruby know it’s
               a class method?


        class Scotch < Liquor
          def self.tasty?
            true
          end
        end




          There’s only one m_tbl.
SINGLETON CLASSES
When you define a method with
“def self.(name)”, you create a singleton class.
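
The class method can be seen sitting in the singleton class rather than in Scotch's own method table. A small sketch; `Liquor` is defined here only so the example runs on its own:

```ruby
class Liquor; end

class Scotch < Liquor
  def self.tasty?
    true
  end
end

p Scotch.instance_methods(false)                   # []
p Scotch.singleton_class.instance_methods(false)   # [:tasty?]
p Scotch.tasty?                                    # true
```

So there is still only one m_tbl per class; the class method lives in the m_tbl of a different (singleton) class.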
‘self’ is not everything it appears to be.
class Foo
               # self == Foo
               def bar
               end
             end

‘bar’ is defined as an instance method of ‘Foo’
class Foo
          def bar
          end
          def self.bar
          end
        end

These should be the same, right?
class Foo
        def bar
        end
        def self.bar
        end
      end

...because these are the same:
      my_method
      self.my_method
class Foo
                def bar
                end
                def self.bar
                end
              end

Invocations use ‘self’ if no receiver is given.
         Don’t definitions? O_o
NOPE.
def foo           Defines on the default definee


def target.bar    Defines on target.singleton_class
i.e. there's a second, hidden context:

        default_definee


default_definee   Target for method definitions with no target

self              Receiver for method invocations with no receiver
No easy way to reference the default definee.
eval "def _d;end"
y = method(:_d).owner rescue instance_method(:_d).owner
eval "undef _d"
y




             This is ugly, but it works.
DEFINEE = 'eval "def _d;end";' \
          'y = method(:_d).owner rescue instance_method(:_d).owner;' \
          'eval "undef _d";y'

class Foo
  puts eval DEFINEE #=> Foo
end



              Really ugly. Really works.
self != default definee
Context              self becomes          definee becomes

class C              C                     C

C.class_eval         C                     C

C.instance_eval      C                     C.singleton_class

obj.instance_eval    obj                   obj.singleton_class

(in C) def foo       obj                   C

obj.send :eval       obj                   NO CHANGE

class << obj         obj.singleton_class   obj.singleton_class
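
Several of these contexts can be checked by defining a method in each and seeing where it lands. A minimal sketch; all method names are made up for illustration:

```ruby
class C; end
obj = C.new

# class_eval: definee is C, so this becomes an instance method.
C.class_eval { def from_class_eval; end }

# instance_eval on a class: definee is C.singleton_class.
C.instance_eval { def from_instance_eval; end }

# instance_eval on an object: definee is obj.singleton_class.
obj.instance_eval { def only_mine; end }

# def inside a method body: self is the receiver,
# but the definee is the enclosing class.
class C
  def outer
    def inner; end
  end
end
obj.outer

p C.instance_methods(false).sort                # [:from_class_eval, :inner, :outer]
p C.singleton_class.instance_methods(false)     # [:from_instance_eval]
p obj.singleton_class.instance_methods(false)   # [:only_mine]
```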
