Securing eZ Find's Solr installation
When using eZ Publish's eZ Find extension on a public-facing site or project -- arguably any project -- it is vital to secure it to prevent unauthorized access and potential data loss. eZ Find is powered by Apache Solr, an open-source search server based on the Lucene Java search library. Its power and flexibility make eZ Find a great tool when working with a lot of content in eZ Publish.
With great power comes great responsibility, as they say. That holds true here as well, although in a slightly different way.
eZ Publish websites interact with Solr via HTTP calls. Solr's configuration is wide open by default; consider a quote from https://wiki.apache.org/solr/SolrSecurity:
"First and foremost, Solr does not concern itself with security either at the document level or the communication level. It is strongly recommended that the application server containing Solr be firewalled such the only clients with access to Solr are your own. A default/example installation of Solr allows any client with access to it to add, update, and delete documents (and of course search/read too), including access to the Solr configuration and schema files and the administrative user interface." (emphasis added)
The canonical risk? A remote anonymous user can trivially clear a site's index if Solr is not secured prior to deployment.
eZ Find 2.1 to eZ Find LS 5.2 use Solr 3.5.x/3.6.x, which uses Jetty 6.x for its HTTP stack. In this post we'll use Jetty 6.x.
eZ Find 5.3 introduced Solr 4.7.x, which uses Jetty 8.x. The configuration for Jetty 8.x differs slightly and won't be covered here. See the links section for further reading.
Here's how to secure your Solr search installation with basic authentication. In short, you will configure a username and password so that only eZ Publish can update and otherwise manage the search index. Remote developers can use the same username and password to access the Solr admin panel for debugging purposes.
The UserRealms block is usually already in jetty.xml, just commented out. Uncomment it as shown below.
The refreshInterval setting determines how often (in seconds) the realm.properties file is checked for updates.
    <Set name="UserRealms">
      <Array type="org.mortbay.jetty.security.UserRealm">
        <Item>
          <New class="org.mortbay.jetty.security.HashUserRealm">
            <Set name="name">Solr Admin Auth</Set>
            <Set name="config"><SystemProperty name="jetty.home" default="."/>/etc/realm.properties</Set>
            <Set name="refreshInterval">0</Set>
          </New>
        </Item>
      </Array>
    </Set>
Add the following code after the closing </locale-encoding-mapping-list> tag.
The realm-name parameter in login-config needs to exactly match the name set in jetty.xml.
    <login-config>
      <auth-method>BASIC</auth-method>
      <realm-name>Solr Admin Auth</realm-name>
    </login-config>

    <security-constraint>
      <web-resource-collection>
        <web-resource-name>Solr Admin Auth</web-resource-name>
        <url-pattern>/admin/*</url-pattern>
        <url-pattern>/evu/admin/*</url-pattern>
        <url-pattern>/webcrawl/admin/*</url-pattern>
      </web-resource-collection>
      <auth-constraint>
        <role-name>admin_user</role-name>
      </auth-constraint>
      <!-- for use when solr accessed via https
      <user-data-constraint>
        <transport-guarantee>CONFIDENTIAL</transport-guarantee>
      </user-data-constraint>
      -->
    </security-constraint>

    <security-constraint>
      <web-resource-collection>
        <web-resource-name>Solr Update Auth</web-resource-name>
        <url-pattern>/update/*</url-pattern>
        <url-pattern>/evu/update/*</url-pattern>
        <url-pattern>/webcrawl/update/*</url-pattern>
      </web-resource-collection>
      <auth-constraint>
        <role-name>admin_user</role-name>
        <role-name>update_user</role-name>
      </auth-constraint>
      <!-- for use when solr accessed via https
      <user-data-constraint>
        <transport-guarantee>CONFIDENTIAL</transport-guarantee>
      </user-data-constraint>
      -->
    </security-constraint>

    <security-constraint>
      <web-resource-collection>
        <web-resource-name>Disable TRACE</web-resource-name>
        <url-pattern>/</url-pattern>
        <http-method>TRACE</http-method>
      </web-resource-collection>
      <auth-constraint/>
    </security-constraint>
Add the user and role credentials to realm.properties. The update_user role is optional.
Different encryption/hashing options are available, and you can do some further reading about them. We'll stick with plain text for illustration purposes.
    # user: password, role-name, role-name ...
    #
    solradm: solradmpass, admin_user
    solrupd: solrupdpass, update_user
The credentials set here need to match those in solr.ini covered in the next step.
For a real-world setup, choose a much stronger username and password combination than the ones shown here.
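As one hedged illustration of a non-plain-text option: Jetty's hash realm also accepts a password stored as MD5: followed by the hex MD5 digest of the password. The sketch below assumes GNU md5sum is installed and that the password is plain ASCII; the helper name jetty_md5_credential is made up for this example. Keep in mind that an unsalted MD5 digest only keeps the password from being read at a glance; it is not strong protection.

```shell
# Hypothetical helper: print a password in the "MD5:<hex digest>" form
# that can be used in place of the plain-text password in realm.properties.
jetty_md5_credential() {
    # printf avoids the trailing newline that echo would add to the hashed input
    digest=$(printf '%s' "$1" | md5sum | cut -d' ' -f1)
    printf 'MD5:%s\n' "$digest"
}

# Example: build a complete realm.properties line for the admin user
# printf 'solradm: %s, admin_user\n' "$(jetty_md5_credential 'solradmpass')"
```

The value in solr.ini stays the plain-text password either way, since eZ Publish has to send the real password with each request.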
Update the following settings in the [SolrBase] block of solr.ini.
This step is vital because it gives eZ Publish the credentials it needs. Without them, requests from eZ Publish to the protected Solr URLs (such as /update) are rejected, so content can no longer be indexed.
    SearchServerAuthentication=enabled
    SearchServerAuthenticationMethod=basic
    SearchServerUserPass=solradm:solradmpass
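Because a mismatch between the two files fails silently until the next indexing attempt, a quick consistency check can help. The following is a minimal sketch, assuming plain-text passwords in realm.properties and a password without colons or commas; the function name credentials_match is hypothetical, and both file paths are passed in rather than assumed.

```shell
# Hypothetical helper: succeed if the user:password pair from solr.ini
# has a matching plain-text entry in realm.properties.
# Usage: credentials_match <path to solr.ini> <path to realm.properties>
credentials_match() {
    ini_file=$1
    realm_file=$2

    # Value of the SearchServerUserPass line, e.g. "solradm:solradmpass"
    userpass=$(sed -n 's/^SearchServerUserPass=//p' "$ini_file" | head -n 1)
    [ -n "$userpass" ] || return 1
    user=${userpass%%:*}
    pass=${userpass#*:}

    # realm.properties lines look like "user: password, role, role"
    realm_pass=$(awk -F': *' -v u="$user" \
        '$1 == u { split($2, parts, ","); print parts[1]; exit }' "$realm_file")

    [ "$pass" = "$realm_pass" ]
}

# Example:
# credentials_match path/to/solr.ini path/to/realm.properties && echo "credentials match"
```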
Clear eZ Publish's ini cache
    cd <ezpublish project root>
    php bin/php/ezcache.php --clear-tag=ini
The command(s) required for this step might vary depending on your OS and Solr service startup method.
If you are running Solr via one of the startup scripts for Debian, Gentoo and RedHat/CentOS provided as part of the eZ Find extension, the following will do.
sudo /etc/init.d/solr restart
The startup scripts are available in extension/ezfind/scripts/(debian|gentoo|rhel)/solr and provide options to start, stop, reload, restart and check the service status.
Setup instructions for these scripts are available at the top of each script file.
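Once Solr has restarted, it is worth confirming that the admin URL now asks for credentials. The sketch below uses curl to fetch only the HTTP status code. It assumes Solr is reachable at localhost:8983 under the /solr context and uses the sample credentials from this post; adjust both for your setup. The helper names http_status and describe_status are made up for this example.

```shell
# Hypothetical helper: print the HTTP status code returned for a URL.
# Extra arguments (for example: -u user:password) are passed straight to curl.
http_status() {
    url=$1
    shift
    curl -s -o /dev/null --max-time 5 -w '%{http_code}' "$@" "$url"
}

# Hypothetical helper: translate a status code into a short verdict.
describe_status() {
    case "$1" in
        200) echo "accessible" ;;
        401) echo "authentication required" ;;
        000) echo "unreachable" ;;
        *)   echo "unexpected status $1" ;;
    esac
}

# Example (URL and credentials are assumptions, see above):
# Without credentials, the hoped-for verdict is "authentication required":
# describe_status "$(http_status http://localhost:8983/solr/admin/)"
# With credentials, the hoped-for verdict is "accessible":
# describe_status "$(http_status http://localhost:8983/solr/admin/ -u solradm:solradmpass)"
```

If the first request still reports the admin page as accessible, the security constraints are not being applied and the configuration steps above are worth rechecking.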
Note: These rules assume that Solr is running on the same machine as your eZ Publish installation.
The general idea of this approach is to implement networking rules with iptables so that only the local server is allowed to manage the search index.
With this command, you are configuring iptables to only allow the local machine to access port 8983, where Solr runs by default.
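    # Drop TCP traffic to port 8983 that arrives on eth0 from any address other than 127.0.0.1
    iptables -A INPUT -p tcp -i eth0 ! -s 127.0.0.1 --dport 8983 -j DROP
    # Print the active rules; redirect this output to a file if you want to restore them later
    iptables-save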
While this does protect Solr's administration interface against remote access, it also locks out developers who may need the Solr admin panel for debugging. To get around this issue, we can create an SSH tunnel from our developer machine to the remote server.
    # Set up the tunnel in the background
    ssh -N -C -q -f -L 127.0.0.1:8983:127.0.0.1:8983 <ssh-account>@<domain>

    # -- or --

    # Set up the tunnel and open an SSH session
    ssh -L 127.0.0.1:8983:127.0.0.1:8983 <ssh-account>@<domain>
The first port number is the local port, the one you would like to tunnel from on your machine. The second port is the destination port, where you would like to tunnel to.
To close a tunnel that is running in the background, find the ssh process with netstat -tulpn, then kill it using its process ID.
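Picking the process ID out of the netstat listing can be scripted. This is a rough sketch, assuming the Linux net-tools output format, where the fourth column is the local address and the last column reads PID/program; the function name tunnel_pid is hypothetical.

```shell
# Hypothetical helper: read `netstat -tulpn` output on stdin and print the PID
# of the ssh process listening on the given local port (default 8983).
tunnel_pid() {
    port=${1:-8983}
    awk -v port="$port" '
        # Column 4 is the local address, the last column is "PID/program name"
        $4 ~ (":" port "$") && $NF ~ /\/ssh$/ {
            split($NF, parts, "/")
            print parts[1]
            exit
        }
    '
}

# Example:
# pid=$(netstat -tulpn 2>/dev/null | tunnel_pid 8983) && [ -n "$pid" ] && kill "$pid"
```

Matching on the program name as well as the port keeps the helper from returning a local Solr process that happens to listen on the same port.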
While iptables is arguably the better option as it locks down access at a higher level, it does come with some inconveniences for the developer. Having to tunnel in or configure iptables to allow access to a number of developer IPs, which may change over time, adds ongoing management overhead.
The iptables setup is also usually not part of the site's code repository, which means that if the site is moved to a new host and the iptables rules aren't moved along with it, Solr would be vulnerable again.
The BasicAuth setup using Jetty's UserRealms, on the other hand, would be part of the code base and would safeguard the Solr install in such a case.
Ideally, both methods should be used in conjunction to avoid nasty surprises.
Automated security audit tool
As a footnote: We have built and made public a scanning tool to detect various problems, especially with eZ Publish sites. It includes detection of a vulnerable Solr instance.
The tool is here: http://audit.mugo.ca/