Some differences in mxOPAQUE CLASS and Object metadata, particularly for Strings #31
Hi!
I was looking at how objects are stored in MAT files, and came across this repository and your explanation in `MatFileHandler/objects.md`. When trying to replicate it, I observed some small differences in the object metadata.

Some general differences I observed:

- `fieldContentsID` in Region 4 is numbered starting from 0.
- The object metadata has the form `(classID, 0, 0, X, Y, objectID)`. I'll get into what `X` and `Y` are doing below.

Some differences for the `string` class:

- The object metadata has the form `(classID, 0, 0, X, 0, objectID)`. I will call `X` here `stringObjectID`, as it started at `1` and incremented for each `string` object in the file. In contrast, for user-defined classes `X` is set to `0`. Instead, the fifth field (`Y`, as mentioned earlier) starts at `1` and increments for every user-defined class in the file. I also tried `datetime`, which used the `Y` field. My guess is that these two fields are some type of internal identifier for certain categories of objects.
- For `string`, Region 2 was present, and structured exactly the same as Region 4 would be, i.e., three 32-bit integers with the format `(fieldID, 1, fieldContentsID)`. Only one field is present for each string object. No string-related data was present in Region 4. In some examples, `fieldID` was set to `5`, and in others it was set to `1`. I need to go through a few more examples for this.
- The `fieldContents` cell for strings was interesting. The array flag for this was set as `mxUINT64_CLASS` with dimensions `[1, (5+k)]`, where `k` depends on the length of the string. The first four 64-bit integers were set to `[1,2,1,1]`, and the fifth integer specified the number of characters in the string. The next `k` columns contained the actual string contents, null-terminated and padded to 8-byte blocks. However, the content was stored as UTF-16 characters within these 64-bit columns. Hence, each column essentially stores 4 characters.

I was looking for some help to decode what's happening with strings here, or if there's something else I could be missing. I'm looking to incorporate more objects/examples to help break this down.
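To make the string layout described above concrete, here is a minimal Python sketch of it. It is not taken from MatFileHandler or from MATLAB: the function names are hypothetical, and it assumes little-endian byte order, that the fifth integer counts UTF-16 code units, and that a null terminator is always appended before padding. The meaning of the leading `[1,2,1,1]` values is unknown, so the sketch only checks them.

```python
import struct

# First four 64-bit values observed in the issue; their meaning is unknown.
HEADER = (1, 2, 1, 1)


def encode_string_payload(text):
    """Build the uint64 row for one string, following the observed layout.

    Assumptions (not confirmed): little-endian words, a null terminator is
    always present, and the length field counts UTF-16 code units.
    """
    encoded = text.encode("utf-16-le")
    n_chars = len(encoded) // 2            # UTF-16 code units
    raw = encoded + b"\x00\x00"            # null terminator
    raw += b"\x00" * (-len(raw) % 8)       # pad to a multiple of 8 bytes
    words = struct.unpack(f"<{len(raw) // 8}Q", raw)
    return list(HEADER) + [n_chars] + list(words)


def decode_string_payload(words):
    """Recover the string from a [1, 5+k] uint64 row as described above."""
    if tuple(words[:4]) != HEADER:
        raise ValueError("unexpected header values: %r" % (words[:4],))
    n_chars = words[4]
    # Reinterpret the k trailing 64-bit columns as raw bytes (4 chars each).
    raw = struct.pack(f"<{len(words) - 5}Q", *words[5:])
    return raw[: 2 * n_chars].decode("utf-16-le")


if __name__ == "__main__":
    row = encode_string_payload("hello")
    print(row)                         # header, length, then k packed columns
    print(decode_string_payload(row))
```

Under these assumptions a 5-character string needs `k = 2` columns (10 bytes of text plus the terminator, padded to 16 bytes), which matches the "4 characters per column" observation. Comparing the output against a real file would show whether the terminator and byte-order assumptions hold.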
The input data I used:
@foreverallama Hi!
Thanks for the information, this looks very interesting! I noticed some additional things while doing the latest beta with enumerations support, but I didn't update the description in `objects.md` yet. I need some time to think about all this, and to do more investigations.

Thanks again!
Sure! I'm also looking into decoding what these flags mean in more detail using other objects, and maybe expand upon the existing documentation as well. I'd be happy to help out along these lines if needed 😄