V8-0.2.5源码阅读笔记

| categories notes 
tags V8 

// 静态类型校验
#define TYPE_CHECK(T, S)                              \
  while (false) {                                     \
    *(static_cast<T**>(0)) = static_cast<S*>(0);      \
  }

这里左边一定要用T**而非T*的原因是左值(lvalue),T*不符合语法。

  template <class S> bool operator==(Handle<S> that) {
    void** a = reinterpret_cast<void**>(**this);
    void** b = reinterpret_cast<void**>(*that);
    if (a == 0) return b == 0;
    if (b == 0) return false;
    return *a == *b;
  } 

详情见我在SO上的提问:http://stackoverflow.com/a/41537632/3758809

这个地方用void**是取出对象的头sizeof(void*)个字节,然后做比较。具体和V8的对象内存布局有关。为了垃圾回收,不能直接获取指针本身。

    FILE* file = fopen(file_name, "rb");
    if (file == NULL) {
        return 0;
    }
    
    fseek(file, 0, SEEK_END);
    int size = ftell(file);
    rewind(file);

    char* chars = new char[size+1];
    int read = -1;
    char[size] = '\0';
    for (int i = 0; i < size; i += read) { // hint: great idea!
        read = fread(chars[i], 1, size - i, file);
    }

一个读文件的技巧。简单有效的把整个文件内容读到内存中。

应该是一种标准用法。合理的利用了fread的返回值。

// ----------------------------------------------------------------------------
// BitField is a help template for encoding and decode bitfield with
// unsigned content.
template<class T, int shift, int size>
class BitField {
 public:
  // Tells whether the provided value fits into the bit field.
  static bool is_valid(T value) {
    return (static_cast<uint32_t>(value) & ~((1U << (size)) - 1)) == 0;
    // hint: value's top (32-size) bits must be zero.
  }

  // Returns a uint32_t mask of bit field.
  static uint32_t mask() {
    return (1U << (size + shift)) - (1U << shift);
    // hint: return-value's shift-to-size(low to high) bits be 1.
  }

  // Returns a uint32_t with the bit field value encoded.
  static uint32_t encode(T value) {
    ASSERT(is_valid(value));
    return static_cast<uint32_t>(value) << shift;
  }

  // Extracts the bit field from the value.
  static T decode(uint32_t value) {
    return static_cast<T>((value >> shift) & ((1U << (size)) - 1));
  }
};

不多解释。如下图

0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 0 0 0 0 0 0
              |               |
          size+shift        shift
              -----------------
                      |
                    value 

页布局如下

0   4   8           256                          1024 (byte)
|   |   |            |                            |
+-------------------------------------------------+
|f1 |f2 |     RSet   |     Object Area            |
+-------------------------------------------------+

The first word(f1) of a page is an opaque page header that has the address of the next page and its ownership information. 
The second word(f2) may have the allocation top address of this page. 
The next 248 bytes(RSet) are remembered sets. 

Rset的大小为248-bits,每个bit对应Object区的一个32-bits指针,因为头256-bits(32*8)不需要RSet存储,所以248-bits(8192/32-8)足够表示剩下的地址。

kBitsPerPointer:32,v8-0.2.5只支持32位机器,因此指针大小sizeof(void*)永远等于4,kBitsPerPointer = kPointerSize(4) * kBitsPerByte(8)。
kPageSize: Page大小,8192-bits(2^13),即8KB。
kRSetEndOffset: RSet的结束位置,512(kPageSize/kBitsPerPointer)。
kRSetStartOffset: RSet的开始位置,8(kRSetEndOffset/kBitsPerPointer)。
kObjectStartOffset: Object区的开始位置。
kObjectAreaSize:Object区的大小,等于kMaxHeapObjectSize。

能用248bits表示当前页的所有指针的原因是,指针必须32-bits对齐。

kRSetStartOffset这里的计算方式我怀疑有问题,虽然8是正确的。

(注:v8-0.2.5只支持32位机器,因此指针大小sizeof(void*)永远等于4)

Smi范围为 -2^31 - 2^31-1。

这个范围内的数,对于正值,比特模式为00XXXXX。对于负值,比特模式为11XXXXX。存储整数(int->Smi)时,左移动一位;获取数值(Smi->int)时,右移一位。

利用右移位时的符号位拓展规则,可以在移位后不用去考虑符号位,因为总是和其真实值一致。合理的利用机器的特性可以高效的实现功能。

HandleScope通过blocks存储对Handle在栈上的引用。

class EXPORT HandleScope {
 public:
  HandleScope() : previous_(current_), is_closed_(false) {
    current_.extensions = 0;
  }

  ...
  ...

  class EXPORT Data {
   public:
    int extensions;
    void** next;
    void** limit;
    inline void Initialize() {
      extensions = -1;
      next = limit = NULL;
    }
  };

  static Data current_;
  const Data previous_;

  ...
  ...
};

class HandleScopeImplementer {
 public:
 
  ...
  ...

 private:
  List<void**> blocks;
  Object** spare;
  int call_depth;
  // Used as a stack to keep track of entered contexts.
  List<Handle<Object> > entered_contexts_;
  // Used as a stack to keep track of saved contexts.
  List<Handle<Object> > saved_contexts_;
  // Used as a stack to keep track of saved security contexts.
  List<Handle<Object> > saved_security_contexts_;
  bool ignore_out_of_memory;
  // This is only used for threading support.
  ImplementationUtilities::HandleScopeData handle_scope_data_;

  ...
  ...
};

static i::HandleScopeImplementer thread_local;

List<void**> HandleScopeImplementer::blocks;
为一堆存储空间的列表。
void** block;
其中每一项block为大小kHandleBlockSize(1022)的void指针数组。
int HandleScope::extensions;
为blocks上挂载的block的数量。
void** HandleScope::limit;
为blocks中最后一个block的最后一个元素+1的指针,即标志blocks空间的结尾。注意limit地址在可用地址空间之外。
void** HandleScope::next;
blocks上下一个可用的地址空间的指针。

CALL_HEAP_FUNCTION(Heap::LookupSymbol(string), String);

通过Heap::XXX来申请内存,然后用CALL_HEAP_FUNCTION来检查是否需要GC。

from http://www.jayconrod.com/posts/55/a-tour-of-v8-garbage-collection

New-space: Most objects are allocated here. New-space is small and is designed to be garbage collected very quickly, independent of other spaces.

Old-pointer-space: Contains most objects which may have pointers to other objects. Most objects are moved here after surviving in new-space for a while.

Old-data-space: Contains objects which just contain raw data (no pointers to other objects). Strings, boxed numbers, and arrays of unboxed doubles are moved here after surviving in new-space for a while.

Code-space: Code objects, which contain JITed instructions, are allocated here. This is the only space with executable memory (although Codes may be allocated in large-object-space, and those are executable, too).

Cell-space, property-cell-space and map-space: These spaces contain Cells, PropertyCells, and Maps, respectively. Each of these spaces contains objects which are all the same size and has some constraints on what kind of objects they point to, which simplifies collection.

Large-object-space: This space contains objects which are larger than the size limits of other spaces. Each object gets its own mmap'd region of memory. Large objects are never moved by the garbage collector.


enum AllocationSpace {
  NEW_SPACE,
  OLD_SPACE,
  CODE_SPACE,
  MAP_SPACE,
  LO_SPACE,
  FIRST_SPACE = NEW_SPACE,
  LAST_SPACE = LO_SPACE
};

#define AAA(V)      \
    V(A1, A2)       \
    V(B1, B2)       \
    V(C1, C2)

#define BBB(m, n)   \
    m * n;
  AAA(BBB)
#undef BBB

展开宏:

A1 * A2;
B1 * B2;
C1 * C2;

直接把BBB的内容套到V上即可。

// Returns true if x is a power of 2.  Does not work for zero.
template <typename T>
static inline bool IsPowerOf2(T x) {
  return (x & (x - 1)) == 0;
}

对象的内存布局

class Object BASE_EMBEDDED {
  // Layout description.
  static const int kSize = 0;  // Object does not take up any space.
}

class HeapObject: public Object {
  // Layout description.
  // First field in a heap object is map.
  static const int kMapOffset = Object::kSize; // 0
  static const int kSize = kMapOffset + kPointerSize; // sizeof(void*) == 4
}

class Map: public HeapObject {
  // Layout description.
  static const int kInstanceAttributesOffset = HeapObject::kSize; // 4
  static const int kPrototypeOffset = kInstanceAttributesOffset + kIntSize; // 8; sizeof(int) == 4
  static const int kConstructorOffset = kPrototypeOffset + kPointerSize; // 12
  static const int kInstanceDescriptorsOffset =
      kConstructorOffset + kPointerSize; // 16
  static const int kCodeCacheOffset = kInstanceDescriptorsOffset + kPointerSize; // 20
  static const int kSize = kCodeCacheOffset + kIntSize; // 24
  
  
  // Bit positions for bit field.
  static const int kUnused = 0;  // To be used for marking recently used maps.
  static const int kHasNonInstancePrototype = 1;
  static const int kIsHiddenPrototype = 2;
  static const int kHasNamedInterceptor = 3;
  static const int kHasIndexedInterceptor = 4;
  static const int kIsUndetectable = 5;
  static const int kHasInstanceCallHandler = 6;
  static const int kNeedsAccessCheck = 7;
}

0                                                32
+-------------------------------------------------+  <- HeapObject
|                    MapWord                      |
+-------------------------------------------------+  <- Map
| size | type | used property fields | bit field  |instance attributes
+-------------------------------------------------+
|                   prototype                     |
+-------------------------------------------------+
|                  constructor                    |
+-------------------------------------------------+
|              instance descriptors               |
+-------------------------------------------------+
|                  code cache                     |
+-------------------------------------------------+

在V8中,类本身并不声明空间,而是通过申请一块内存,转换成类的类型,然后通过宏直接往这块地址写字节。

类的作用是来方便的”管理”这块内存。

HeapObject的首字节为一个MapWord,通过meta_map()来生成。

namespace v8 { namespace internal {
    
  class Heap : public AllStatic {
    static Object* meta_map_;
  
    static Object* meta_map() {
      return meta_map_;
    }
  }
    
  class Factory : public AllStatic {
    static Handle<Object> meta_map() {
      return Handle<Object>(&Heap::meta_map_);
    }
  }

}}

其他一些类型

Obball: describes objects null, undefined, true, and false.
0                                                32
+-------------------------------------------------+  <- HeapObject+Map
|                       32x6                      |
+-------------------------------------------------+  <- Oddball
|                      toString                   |
+-------------------------------------------------+
|                      toNumber                   |
+-------------------------------------------------+

undefined_value_ = Allocate(oddball_map(), CODE_SPACE);

*****************************************************

Script: describes a script which has been added to the VM.
0                                                32
+-------------------------------------------------+  <- HeapObject+Map
|                       32x6                      |
+-------------------------------------------------+  <- Script
|                      source                     |
+-------------------------------------------------+
|                   line_offset                   |
+-------------------------------------------------+
|                  column_offset                  |
+-------------------------------------------------+
|                     wrapper                     |
+-------------------------------------------------+
|                      type                       |
+-------------------------------------------------+

  // [source]: the script source.
  DECL_ACCESSORS(source, Object)

  // [name]: the script name.
  DECL_ACCESSORS(name, Object)

  // [line_offset]: script line offset in resource from where it was extracted.
  DECL_ACCESSORS(line_offset, Smi)

  // [column_offset]: script column offset in resource from where it was
  // extracted.
  DECL_ACCESSORS(column_offset, Smi)

  // [wrapper]: the wrapper cache.
  DECL_ACCESSORS(wrapper, Proxy)

  // [type]: the script type.
  DECL_ACCESSORS(type, Smi)

*****************************************************

Proxy: describes objects pointing from JavaScript to C structures.
0                                                32
+-------------------------------------------------+  <- HeapObject+Map
|                       32x6                      |
+-------------------------------------------------+  <- Proxy
|                       proxy                     |
+-------------------------------------------------+

*****************************************************

JSObject: describes real heap allocated JavaScript objects with properties.
0                                                32
+-------------------------------------------------+  <- HeapObject+Map
|                       32x6                      |
+-------------------------------------------------+  <- JSObject
|                    properties                   |
+-------------------------------------------------+
|                     elements                    |
+-------------------------------------------------+

*****************************************************

JSFunction: describes JavaScript functions.
0                                                32
+-------------------------------------------------+  <- HeapObject+Map+JSObject
|                       32x6                      |
+-------------------------------------------------+  <- JSFunction
|          prototype or initial map               |
+-------------------------------------------------+
|            shared function info                 |
+-------------------------------------------------+
|                  context                        |
+-------------------------------------------------+
|                 literals                        |
+-------------------------------------------------+

  // [prototype_or_initial_map]:
  DECL_ACCESSORS(prototype_or_initial_map, Object)

  // [shared_function_info]: The information about the function that
  // can be shared by instances.
  DECL_ACCESSORS(shared, SharedFunctionInfo)
  
  // Layout of the literals array.
  static const int kLiteralsPrefixSize = 3;
  static const int kLiteralObjectFunctionIndex = 0;
  static const int kLiteralRegExpFunctionIndex = 1;
  static const int kLiteralArrayFunctionIndex = 2;


If you liked this post, you can share it with your followers !