Adding search to a Middleman blog
slightly simpler than Google
- tags
- middleman
- ruby
We’re going to build a simple, naive search for Middleman blogs. We’ll generate a search index at build time, and then use that index to perform the search itself on the client side.
Building the index
When you type something into Google, it doesn’t go and hit every page on the internet to see if there’s a match. It doesn’t even look at every page that it has squirreled away somewhere in the Googleplex. What it consults is an index of the documents out there, and the index points to the page information. (We all know that it’s a lot more complicated than that really, but run with it.)
The first thing we’re going to do is create a very simple version of this index for your site. This is going to be in a file called source/article.index.json.erb. The template needs to:
- Go through all of the articles.
- Add metadata for the article into the master map.
- Find all of the words in the article, by stripping out all of the html tags, making things lowercase, and breaking it apart by white space.
- Insert all of those words into our index.
- Convert the whole sucker to JSON.
<%
  map = { articles: {} }
  index = {}
  blog.articles.each do |article|
    # metadata for each article, keyed by its url
    map[:articles][article.url] = {
      title: article.title,
      date: article.date,
      tags: article.tags
    }
    words = "#{article.title} #{article.body}"
      .downcase                # make lowercase
      .gsub( /<.*?>/m, " " )   # replace tags with a space so neighbouring words stay apart
      .gsub( /[^\w\s]/, "" )   # drop punctuation but keep whitespace, newlines included
      .split                   # split on whitespace, no empty strings
      .sort.uniq
    words.each do |w|
      index[w] ||= []
      index[w] << article.url
    end
  end
  map[:index] = index
%>
<%= map.to_json %>
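To make the shape of the output concrete, here is roughly what the generated JSON would contain for a blog with two articles, written out as a JavaScript object. The urls, titles, dates and tags are all invented for illustration, and the exact date format depends on how your Ruby setup serializes dates.

```javascript
// Hypothetical contents of the generated index for a two-article blog.
// Every url, title, date and tag here is made up for illustration.
const sample_index = {
  articles: {
    "/2014/05/01/hello-middleman/": {
      title: "Hello Middleman",
      date: "2014-05-01",
      tags: ["middleman"]
    },
    "/2014/05/08/ruby-tips/": {
      title: "Ruby tips",
      date: "2014-05-08",
      tags: ["ruby"]
    }
  },
  // each lowercase word points at the urls of the articles containing it
  index: {
    hello: ["/2014/05/01/hello-middleman/"],
    middleman: ["/2014/05/01/hello-middleman/", "/2014/05/08/ruby-tips/"],
    ruby: ["/2014/05/08/ruby-tips/"],
    tips: ["/2014/05/08/ruby-tips/"]
  }
};

// Looking up a word gives urls; looking up a url gives the metadata.
const urls = sample_index.index["ruby"];
const titles = urls.map(function (url) {
  return sample_index.articles[url].title;
});
console.log(titles); // logs the one matching title, "Ruby tips"
```

The two-level layout keeps the file small: the metadata for an article is stored once, and the word lists only repeat the url.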
Now let’s add some markup to the blog
I’m sticking this in the header, as you see above:
%form.navbar-form.navbar-right#search{ role: "search" }
  .form-group.dropdown
    .input-group
      %input.form-control#search_box{ type: "text", placeholder: "Search", autocomplete: "off" }
      %span.input-group-btn
        %button.btn.btn-default
          %span.glyphicon.glyphicon-search
    %ul.dropdown-menu.dropdown-menu-left.results
      = link_to "Title", "/url"
Loading and Querying the index
Ok, let’s build this from the ground up. All this goes into application.js.
First we’ll create a method that loads up the index if we need it. We’re going to use a promise here, so if multiple requests come in at the same time only one will go to the server.
This is called like article_index().done( function( index ) { console.log( index ) } ).
The second time it is called, it returns the first promise again, so everything is nice and fast.
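A minimal sketch of such a loader follows. It uses the browser's native fetch and Promise rather than the jQuery-style promise that the .done call suggests, so callers of this version would use .then instead. The /article.index.json url and the optional fetch_json parameter are assumptions made for this sketch.

```javascript
// Cached promise for the index; null until the first call.
let index_promise = null;

// Returns a promise for the parsed index. Every call after the first
// reuses the same promise, so at most one request is made.
// `fetch_json` is a hypothetical parameter (not in the original post) that
// lets the loader run outside a browser; by default it uses fetch.
function article_index(fetch_json) {
  if (index_promise === null) {
    const load = fetch_json || function (url) {
      return fetch(url).then(function (response) {
        return response.json();
      });
    };
    // Assumes the template above is built to /article.index.json
    index_promise = load("/article.index.json");
  }
  return index_promise;
}
```

With this version, article_index().then( function( index ) { console.log( index ) } ) would log the parsed index, and calling it again returns the promise that is already cached.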
To query the index itself we need to look through all of the words and return a list of urls that match.
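One way to sketch that lookup is shown below. It assumes the index has the { articles, index } shape produced by the template, and it treats a word as matching when it starts with the search term, so that typing "midd" already finds "middleman". The prefix rule is an assumption of this sketch, not something the text above specifies.

```javascript
// Returns the unique urls of every article containing a word that starts
// with `term`. Prefix matching is an assumption made for this sketch.
function match_index(index, term) {
  const needle = term.toLowerCase();
  const urls = new Set();
  Object.keys(index.index).forEach(function (word) {
    if (word.indexOf(needle) !== 0) {
      return; // this word does not start with the term
    }
    index.index[word].forEach(function (url) {
      urls.add(url); // the Set drops duplicate urls
    });
  });
  return Array.from(urls);
}
```

Walking every key is linear in the number of distinct words, which should be fine for a blog-sized index. An exact-match lookup would be a single property access, at the cost of finding nothing until a whole word is typed.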
Now let’s build a simple search. This is a little complicated, since we need to compute the intersection of the results if the user types in multiple words. Here’s what’s happening:
- We create a promise, since we may need to wait for the index to load.
- We split the search term into multiple words.
- Collect the results of the match_index function.
- Compute the intersection of all the results.
- Look up the metadata based on the url.
- Resolve the promise with the results.
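The steps above can be sketched as follows. To keep the block runnable on its own it repeats a compact prefix-matching match_index, and it takes the index loader as a parameter. The load_index parameter and the shape of the returned result objects are assumptions of this sketch.

```javascript
// Compact copy of the lookup helper so this block runs on its own:
// urls of articles containing a word that starts with `term`.
function match_index(index, term) {
  const urls = new Set();
  Object.keys(index.index).forEach(function (word) {
    if (word.indexOf(term) === 0) {
      index.index[word].forEach(function (url) { urls.add(url); });
    }
  });
  return Array.from(urls);
}

// Resolves with the metadata of every article that matches all of the
// words in `term`. `load_index` is a hypothetical parameter: any function
// returning a promise for the index, such as article_index.
function find_article(term, load_index) {
  return load_index().then(function (index) {
    // split the search term into lowercase words, ignoring extra spaces
    const words = term.toLowerCase().split(/\s+/).filter(function (w) {
      return w !== "";
    });
    if (words.length === 0) {
      return [];
    }
    // one list of urls per word
    const lists = words.map(function (word) {
      return match_index(index, word);
    });
    // intersection: keep only the urls that appear in every list
    const urls = lists.reduce(function (kept, list) {
      return kept.filter(function (url) {
        return list.indexOf(url) !== -1;
      });
    });
    // look up the metadata for each surviving url
    return urls.map(function (url) {
      const meta = index.articles[url];
      return { url: url, title: meta.title, date: meta.date, tags: meta.tags };
    });
  });
}
```

Returning the chained promise means callers never need to know whether the index was already loaded. A call such as find_article( "ruby middleman", article_index ) resolves with only the articles that contain both words.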
Wiring it up
First we need to call our code when the user inputs something in the text area:
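A small sketch of that hook-up, using the standard input event, might look like this. The function name watch_search_box and the on_change callback are hypothetical names introduced for this sketch.

```javascript
// Calls `on_change` with the current text whenever the user types.
// `input` is the search box element; `on_change` is a hypothetical
// callback name used only in this sketch.
function watch_search_box(input, on_change) {
  input.addEventListener("input", function () {
    // trim so that a field holding only spaces counts as empty
    on_change(input.value.trim());
  });
}

// In the browser this would be attached to the #search_box element from
// the markup, for example:
//   watch_search_box(document.getElementById("search_box"), handler);
```

The input event fires on typing, pasting and deleting, so the handler sees every change to the field rather than only key presses.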
Then we wire everything together:
- If the field is empty, hide the dropdown.
- Otherwise show the dropdown and a loading message.
- Call find_article and, when it returns, put the results in the result dropdown.
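Those steps could be sketched like this. The dropdown is shown and hidden by setting style.display directly, which is an assumption of this sketch (a Bootstrap dropdown is normally toggled through a class on its parent). The names run_search, render_results and escape_html are hypothetical, and find stands for any function like find_article that returns a promise for a list of results.

```javascript
// Escapes text before it is placed into html.
function escape_html(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Builds the <li> items for the dropdown from a list of results.
function render_results(results) {
  if (results.length === 0) {
    return "<li><a>No results</a></li>";
  }
  return results.map(function (r) {
    return '<li><a href="' + escape_html(r.url) + '">' +
      escape_html(r.title) + "</a></li>";
  }).join("");
}

// `dropdown` is the ul.results element; `find` returns a promise for a
// list of { url, title } objects.
function run_search(term, dropdown, find) {
  if (term === "") {
    dropdown.style.display = "none"; // empty field: hide the dropdown
    return Promise.resolve();
  }
  dropdown.style.display = "block"; // show it with a loading message
  dropdown.innerHTML = "<li><a>Loading...</a></li>";
  return find(term).then(function (results) {
    dropdown.innerHTML = render_results(results);
  });
}
```

Escaping the title and url matters because they come from article content and end up in innerHTML. The sketch does not guard against an older, slower search resolving after a newer one.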
Next steps
- Language stemming
- Logical operations
- Showing more metadata in the search results.