Solr indexing
Docs and sources:
Basic syntax
printf '<add>
<doc>
<field name="id">Car 1</field>
<field name="name">VW Golf hatchback</field>
</doc>
<doc>
<field name="id">Car 2</field>
<field name="name">Toyota Corolla hatchback</field>
</doc>
<doc>
<field name="id">Car 3</field>
<field name="name">Toyota Auris Touring</field>
</doc>
</add>' | curl -X POST -H "Content-Type: text/xml" --data-binary @- "http://localhost:8983/solr/mycore/update?commit=true"
More complex example
<update>
<add>
<doc>
<field name="id">Car 1</field>
<field name="name">VW Golf hatchback</field>
</doc>
<doc>
<field name="id">Car 2</field>
<field name="name">Toyota Corolla hatchback</field>
</doc>
<doc>
<field name="id">Car 3</field>
<field name="name">Toyota Auris Touring</field>
</doc>
</add>
<delete>
<id>Car 1</id>
</delete>
<commit />
<optimize />
</update>
printf '<update>
<add>
<doc>
<field name="id">Car 1</field>
<field name="name">VW Golf hatchback</field>
</doc>
<doc>
<field name="id">Car 2</field>
<field name="name">Toyota Corolla hatchback</field>
</doc>
<doc>
<field name="id">Car 3</field>
<field name="name">Toyota Auris Touring</field>
</doc>
</add>
<delete>
<id>Car 1</id>
</delete>
<commit />
<optimize />
</update>' | curl -X POST -H "Content-Type: text/xml" --data-binary @- "http://localhost:8983/solr/mycore/update"
Alternatives
There are also the official clients for Java, JavaScript, Python and Ruby
Solr also supports other data formats (like JSON), however XML seems to be the most popular
There's also a special syntax for partial updates, where we want to update just a few fields, while leaving others untouched: Partial Updates
Why my changes are not visible? Commit changes
In order to make the data visible while searching, you have to commit changes.
Soft commit (for visibility, faster)
- makes the documents visible for searches
Hard commit (for durability, slower)
- ensures index files (segments) are fully written to the disk (fsync)
- closes transaction log and opens a new one
How to commit data?
- Manually: use the URL parameter
?commit=true
or?softCommit=true
in update requests. - Manually: use command
<commit />
in update XML (hard commit) - Automatically: define
<autoCommit />
and/or<autoSoftCommit />
insolrconfig.xml
; you can setmaxDocs
,maxTime
,maxSize
Transaction Log (tlog)
Log of all updates since the last hard commit. If Solr is not shutdown gracefully, it will replay updates from the tlog at startup.