Hi,
I was recently taking a look into the index data files (for reasons) and came across something that I found strange...
When numeric values are in nodes they are put into the index with min/max and distinct token values, which is cool.
What's strange is that, when negative integer values are in text and attribute nodes, the index contains the minimum value correctly, but the maximum value is '4.9E-324' ([0,0,0,0,0,0,0,1]).
This doesn't seem to happen with positive values.
Now, with small value ranges I assume this is okay, but with many values I would imagine it could slow things down.
Not sure if this is a bug or a feature, so I figured I'd bring it up.
Here is an example:
<r>
<a>-1</a>
<b>0</b>
<c>1</c>
<d>-50000</d>
<d>-49000</d>
<e>2</e>
<e>3</e>
<f a="-1"/>
<g>-1</g>
<g>1</g>
</r>
I would have assumed that the index would see that element "d" has a min of -50000 and a max of -49000.
Here is the index info:
Elements
- Structure: Hash
- Entries: 8
g 2x, 2 distinct integers [-1, 1], leaf
e 2x, 2 distinct integers [2, 3], leaf
d 2x, 2 distinct integers [-50000, 4.9E-324], leaf
c 1x, integer [1, 1], leaf
f 1x, leaf
r 1x
a 1x, integer [-1, 4.9E-324], leaf
b 1x, integer [0, 4.9E-324], leaf
Attributes
- Structure: Hash
- Entries: 1
a 1x, integer [-1, 4.9E-324], leaf
If it's a feature, then cool. Keep on rockin'! If not, then I hope this helps a little.
Thanks,
Zack Dean
Hi Zack,
It helps indeed! I learnt that -Double.MAX_VALUE is smaller than Double.MIN_VALUE in Java. The fix turned out to be pretty straightforward [1]; a new stable snapshot is available [2].
Have fun, Christian
[1] https://github.com/BaseXdb/basex/issues/1616
[2] http://files.basex.org/releases/latest/
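To illustrate the pitfall described above, here is a minimal sketch of how a running maximum goes wrong when it is seeded with Double.MIN_VALUE. This is not the BaseX code; the class and method names (RangeSketch, buggyMax, fixedMax) are made up for this example, and seeding with -Double.MAX_VALUE is only one plausible way to fix it.

```java
// Hypothetical illustration, not taken from the BaseX sources.
final class RangeSketch {
  // Buggy seed: Double.MIN_VALUE is the smallest POSITIVE double (4.9E-324),
  // so it is larger than every negative value and larger than zero.
  static double buggyMax(double... values) {
    double max = Double.MIN_VALUE;
    for (double v : values) max = Math.max(max, v);
    return max;
  }

  // Seed with the most negative finite double instead.
  static double fixedMax(double... values) {
    double max = -Double.MAX_VALUE;
    for (double v : values) max = Math.max(max, v);
    return max;
  }

  public static void main(String[] args) {
    System.out.println(buggyMax(-50000, -49000)); // 4.9E-324
    System.out.println(buggyMax(0));              // 4.9E-324
    System.out.println(buggyMax(2, 3));           // 3.0
    System.out.println(fixedMax(-50000, -49000)); // -49000.0
  }
}
```

The buggy variant reproduces the pattern in the report: ranges containing only negative values or zero end at 4.9E-324, while ranges with at least one positive value look correct.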
On Mon, Aug 27, 2018 at 10:41 PM Zachary N. Dean contact@zadean.com wrote:
[...]
Hi all,
The name of this constant (Double.MIN_VALUE) is somewhat misleading, because it actually contains the smallest positive value, not the most negative one [1]:
[1] https://docs.oracle.com/javase/7/docs/api/java/lang/Double.html#MIN_VALUE
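A short, self-contained check of that statement, using only standard JDK calls. The class name MinValueDemo and the helper bytesOf are made up for this sketch. It also shows why the value appears as [0,0,0,0,0,0,0,1] in the original report, assuming the bytes are written in big-endian order (the default for java.nio.ByteBuffer).

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Hypothetical demo class; only the JDK constants and methods are real.
final class MinValueDemo {
  // Big-endian IEEE 754 encoding of a double (ByteBuffer's default byte order).
  static byte[] bytesOf(double d) {
    return ByteBuffer.allocate(Double.BYTES).putDouble(d).array();
  }

  public static void main(String[] args) {
    System.out.println(Double.MIN_VALUE);                           // 4.9E-324
    System.out.println(Double.MIN_VALUE > 0);                       // true
    System.out.println(-Double.MAX_VALUE < Double.MIN_VALUE);       // true
    System.out.println(Arrays.toString(bytesOf(Double.MIN_VALUE))); // [0, 0, 0, 0, 0, 0, 0, 1]
  }
}
```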
Hoping it helps,
Best regards, Fabrice
-----Original Message-----
From: BaseX-Talk [mailto:basex-talk-bounces@mailman.uni-konstanz.de] On behalf of Christian Grün
Sent: Tuesday, August 28, 2018 09:07
To: Zachary N. Dean
Cc: BaseX
Subject: Re: [basex-talk] Strange index values with numerics
[...]
basex-talk@mailman.uni-konstanz.de