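You could use REPLACE instead of ADD (or db:replace instead of db:add) and use the JSON id as the document path, so that a tweet imported a second time overwrites the existing document instead of creating a duplicate. For more details, have a look at our documentation [1].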
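As a rough sketch of what that could look like (assuming the db:replace signature of database name, path, input, and assuming each tweet is stored as a document with a json root element, as in your query; the $tweet value and its id are made up for illustration):

```xquery
(: Hypothetical stand-in for one converted tweet; in a real import this
   would come from your JSON conversion step. :)
let $tweet :=
  document {
    <json>
      <id__str>1224567890123456789</id__str>
      <text>example tweet</text>
    </json>
  }
(: Use the tweet id as the document path. :)
let $id := $tweet/json/id__str/string()
(: db:replace overwrites a document at that path if one exists,
   and adds the document otherwise. :)
return db:replace("twitter", $id || ".xml", $tweet)
```

Running this twice with the same tweet should leave a single document at that path. The ".xml" suffix is only a naming choice; any path that is unique per id would do.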
Deleting duplicates after the insertion would be another approach, but it will most likely be too slow if you plan to store thousands or millions of tweets.
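If you first want to see which ids are currently duplicated, grouping by id__str should give you a count per id. This is only a sketch, again assuming that every tweet is a document with a json root element and an id__str child; the duplicate element and its attribute names are made up for the output:

```xquery
(: Group all tweets by their id and report the ids that occur more than once. :)
for $tweet in db:open("twitter")/json
group by $id := $tweet/id__str/string()
(: After grouping, $tweet holds all json nodes that share this id. :)
let $n := count($tweet)
where $n > 1
order by $n descending
return <duplicate id="{ $id }" count="{ $n }"/>
```

To get the json nodes themselves rather than the counts, the return clause could return $tweet (all nodes of a group) or tail($tweet) (all but the first one).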
[1] http://docs.basex.org/wiki/Database_Module#db:replace
thufir hawat.thufir@gmail.com wrote on Tue, 4 Feb 2020, 07:41:
Not sure of the correct lingo, but I'm building a database of tweets. As I run it, duplicate tweets are added to the database. I can see the duplicates with:
for $tweets in db:open("twitter") return <tweet>{$tweets/json/id__str}</tweet>
Firstly, how would I select the json node for a duplicate entity? But before even selecting that node, I'd want to look recursively to see whether there's more than one result for that id__str value.
How would I even generate a count of each occurrence for the data of a specific id__str?
thanks,
Thufir