Hi,

I was recently looking into the index data files (for reasons) and came across something strange.

When nodes contain numeric values, the index stores their min/max and the number of distinct values, which is cool.

What's strange is that when text and attribute nodes contain negative integer values, the index stores the minimum correctly, but the maximum is '4.9E-324' ([0,0,0,0,0,0,0,1]).

This doesn't seem to happen with positive values.

Now, with small value ranges I assume this is okay, but with many values I would imagine it could slow things down.

Not sure if this is a bug or a feature, so I figured I'd bring it up.

Here is an example:

<r>
  <a>-1</a>
  <b>0</b>
  <c>1</c>
  <d>-50000</d>
  <d>-49000</d>
  <e>2</e>
  <e>3</e>
  <f a="-1"/>
  <g>-1</g>
  <g>1</g>
</r>

I would have assumed that the index would see that element "d" has a min of -50000 and a max of -49000.
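
To make that expectation concrete, here is a minimal Python sketch that computes the range I'd expect for every element and attribute in the sample. It has nothing to do with how the index is actually built; `SAMPLE`, `expected_ranges` and the `@` prefix for attribute names are just names I made up for illustration.

```python
import xml.etree.ElementTree as ET

# The sample document from this mail.
SAMPLE = """<r>
  <a>-1</a>
  <b>0</b>
  <c>1</c>
  <d>-50000</d>
  <d>-49000</d>
  <e>2</e>
  <e>3</e>
  <f a="-1"/>
  <g>-1</g>
  <g>1</g>
</r>"""


def expected_ranges(xml_text):
    """Return {name: (min, max)} for all integer text and attribute values."""
    values = {}

    def record(name, raw):
        try:
            number = int(raw)
        except (TypeError, ValueError):
            return  # missing or non-integer content is skipped
        values.setdefault(name, []).append(number)

    for node in ET.fromstring(xml_text).iter():
        record(node.tag, node.text)
        # Attribute names get an "@" prefix (illustrative convention only).
        for attr_name, attr_value in node.attrib.items():
            record("@" + attr_name, attr_value)

    return {name: (min(found), max(found)) for name, found in values.items()}


for name, (low, high) in expected_ranges(SAMPLE).items():
    print(f"{name}: [{low}, {high}]")
```

For the sample this yields [-50000, -49000] for "d" and [-1, -1] for both the element "a" and the attribute "a".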

 

Here is the index info:

Elements
- Structure: Hash
- Entries: 8
  g  2x, 2 distinct integers [-1, 1], leaf
  e  2x, 2 distinct integers [2, 3], leaf
  d  2x, 2 distinct integers [-50000, 4.9E-324], leaf
  c  1x, integer [1, 1], leaf
  f  1x, leaf
  r  1x
  a  1x, integer [-1, 4.9E-324], leaf
  b  1x, integer [0, 4.9E-324], leaf

Attributes
- Structure: Hash
- Entries: 1
  a  1x, integer [-1, 4.9E-324], leaf
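
One thing that stands out: 4.9E-324 is how Java prints Double.MIN_VALUE, the smallest positive double, and its big-endian byte pattern is exactly [0,0,0,0,0,0,0,1]. So my guess (and it is only a guess, I haven't checked the source) is that the running maximum starts at that value rather than at negative infinity. That would explain why every maximum of 0 or below gets stuck there while positive maxima look fine. The Python sketch below only illustrates the idea; `track_min_max` and its `initial_max` parameter are hypothetical and not taken from the actual code.

```python
import struct

# The eight bytes shown in the index, read as a big-endian IEEE 754 double.
RAW_MAX = bytes([0, 0, 0, 0, 0, 0, 0, 1])
smallest_positive = struct.unpack(">d", RAW_MAX)[0]  # Python prints this as 5e-324


def track_min_max(values, initial_max):
    """Hypothetical tracker: folds values into a running (min, max) pair."""
    low, high = float("inf"), initial_max
    for value in values:
        low = min(low, value)
        high = max(high, value)
    return low, high


# Assumed faulty start value: no negative input can ever replace it.
print(track_min_max([-50000, -49000], smallest_positive))

# A start value that any input replaces.
print(track_min_max([-50000, -49000], float("-inf")))
```

With the first start value the result is (-50000, 5e-324), which is Python's spelling of the same double; with the second it is (-50000, -49000), which is what I expected for "d".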

 

 

If it's a feature, then cool. Keep on rockin'! If not, then I hope this helps a little.

Thanks,

Zack Dean