Terminalfour Search
Used to add Terminalfour Search to a page for site search results.
Screenshot showing an example of the Terminalfour Search content type.
Content Type Details
- ID: 326
- Name: Terminalfour Search
- Minimum user level: Administrator
- Compatible with page layouts: Full Width
Content Type Elements Details
| Name | Description | Size | Type | Required | Conditionally Shown |
|---|---|---|---|---|---|
| Name | The Name element | 80 Characters | Plain Text | Yes | No |
| Title | Add the title, this is the main H1 heading on the page | 80 Characters | Plain Text | Yes | No |
| No Results Content | Add content that is displayed when no results are returned. See WYSIWYG content options. | 1000 Characters | HTML | Yes | No |
Overview
Terminalfour Search is powered by three distinct components:
- Terminalfour Search Crawler: Responsible for visiting pages and fetching content.
- Site Search Dashboard: Responsible for the user interface, ranking, and display logic.
- Terminalfour Search content type: Adds the Search Results page to the website.
User Interface & Implementation
The Site Search is accessed via the magnifying glass icon in the website header. Clicking this reveals an overlay with a search input.
- Front-end Code: Located in the Handlebars partial `hccHeaderHTML`.
- Live Example: See the Site Search page.
Data Collection (The Crawler)
The Terminalfour Search Crawler is configured to crawl the website daily to fetch metadata and content.
- Configuration: Managed via the `houston-cc-main` configuration, which includes URL settings, exclusions, and metadata mappings.
- Indexing Logic: The crawler must be able to "find" a page to index it. If a page is hidden from navigation and has no inbound links, the crawler will not discover it.
- Robots.txt: The `robots.txt` file added by the Robots File content type is configured to allow the crawler access. For example:

```
User-agent: terminalfour-nutch-spider
Allow: /
Crawl-delay: 0.5
```
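Rules like the example above can be sanity-checked with Python's standard-library `urllib.robotparser` before deploying a `robots.txt` change. This is a minimal sketch; the `Disallow` rule below is an illustrative example, not the live site's configuration:

```python
from urllib.robotparser import RobotFileParser

# Example rules: allow the crawler everywhere except an excluded folder.
# (Illustrative only -- not the production robots.txt.)
rules = """\
User-agent: terminalfour-nutch-spider
Disallow: /component-library/
Allow: /
"""

rp = RobotFileParser()
rp.parse(rules.splitlines())

# First matching rule wins, so the Disallow line blocks its subtree
# while Allow: / permits everything else.
print(rp.can_fetch("terminalfour-nutch-spider", "/about/"))
print(rp.can_fetch("terminalfour-nutch-spider", "/component-library/buttons"))
```

Checking rules this way catches ordering mistakes (for example, a broad `Allow: /` placed before a `Disallow` line, which would make the `Disallow` unreachable).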
Metadata
The crawler picks up metadata from pages. For a full list of how fields are assigned, see metadata.
Automatic Data Mapping
The crawler automatically collects and pushes the following data fields to Site Search, which you will see in the Site Search Dashboard:
| Field | Description |
|---|---|
| host | Host name of the URL |
| url | The full page URL |
| Id | The unique identifier (URL) |
| content | The content extracted from the body of the page |
| tstamp | Timestamp of when the URL was last fetched |
| urlDepth | How many clicks deep the page is from the root |
| url_keywords | Keywords extracted from the URL string |
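The derived fields `urlDepth` and `url_keywords` can be approximated from a URL alone. The sketch below is illustrative: the crawler's exact extraction logic is internal to Terminalfour, and the domain and paths are made-up examples:

```python
from urllib.parse import urlparse
import re

def describe_url(url):
    """Approximate the crawler's derived URL fields.

    Illustrative sketch only: the Terminalfour Search Crawler's actual
    extraction logic may differ.
    """
    parsed = urlparse(url)
    # Path segments, ignoring empty strings from leading/trailing slashes.
    segments = [s for s in parsed.path.split("/") if s]
    return {
        "host": parsed.netloc,
        "url": url,
        # One level of depth per path segment below the root.
        "urlDepth": len(segments),
        # Split each segment on hyphens/underscores to yield keywords.
        "url_keywords": [w for s in segments for w in re.split(r"[-_]", s) if w],
    }

print(describe_url("https://www.example.edu/student-life/housing"))
```

For the example URL this yields a depth of 2 and the keywords `student`, `life`, and `housing`, which matches the intent of the fields described in the table.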
Content Exclusion & Visibility
There are several ways to prevent a page from being indexed or appearing in search results, depending on your access level:
| Method | Effect | Access Level |
|---|---|---|
| 'Remove from crawl' element | Check the 'Remove from crawl' element on a section's 'General' tab. This adds a robots noindex meta tag: the page is not indexed, but links on the page are still followed. | Moderator |
| 'Hide from search' element | Check the 'Hide from search' element on a section's 'General' tab. This adds a sectionDisplay false meta tag: the page is crawled, but hidden from the search results interface. | Moderator |
| Metadata Tab 'Robots' element | Set a custom robots meta tag, e.g., noindex, nofollow, to prevent indexing and stop the crawler from following links. | Moderator |
| Robots.txt Disallow rules | Add disallow rules for the crawler to the robots.txt file using the Robots File content type to prevent the crawler from visiting specific areas/folders entirely, e.g., Disallow: /component-library/* | Administrator |
| URL Filter Regex | Add exclusion rules directly in the Crawler settings. | Crawler Access |
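The URL Filter Regex method behaves like a pre-fetch filter: any URL matching an exclusion pattern is skipped entirely. A minimal sketch of that behavior, assuming Nutch-style regex URL filters (the patterns and URLs below are examples, not the live configuration):

```python
import re

# Example exclusion patterns -- assumptions for illustration only.
EXCLUDE_PATTERNS = [
    re.compile(r"/component-library/"),
    re.compile(r"[?&]preview="),
]

def should_crawl(url):
    """Return False if any exclusion pattern matches the URL."""
    return not any(p.search(url) for p in EXCLUDE_PATTERNS)

print(should_crawl("https://www.example.edu/about/"))
print(should_crawl("https://www.example.edu/component-library/cards"))
```

Unlike a robots noindex meta tag, a URL filter prevents the crawler from fetching the page at all, so nothing from that page (content or links) enters the index.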
Search Management & Customization
Search functionality is managed within the Site Search Dashboard, where settings and behavior can be customized and fine-tuned.
- Searchable Fields: Define which metadata fields are indexed for searching and which are returned for display in the results list.
- Ranking: Adjust weighted ranking criteria, e.g., to give more weight to specific metadata fields.
- Search Rules: Create custom logic to improve results:
- Promotions: Pin specific internal or external pages to the top of results for certain keywords.
- Synonyms: Group related terms (e.g., "Student" and "Learner") so they return the same results.
- Stop Words: Define common words to be ignored by the search engine to improve accuracy.
- Facets (Filters): Manage the sidebar filters that allow users to narrow down results. There is currently a single facet for Type, which is mapped from the `sectionType` meta tag.
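Synonym groups make equivalent terms interchangeable at query time. A minimal sketch of that behavior, assuming the Site Search Dashboard performs an equivalent expansion server-side (the groups below are examples from the text, not the live configuration):

```python
# Example synonym groups -- illustrative only.
SYNONYM_GROUPS = [
    {"student", "learner"},
]

def expand_term(term):
    """Return the set of equivalent terms for a query word."""
    for group in SYNONYM_GROUPS:
        if term.lower() in group:
            return group
    # No group matched: the term only matches itself.
    return {term.lower()}

print(sorted(expand_term("Student")))
```

Because both "Student" and "Learner" expand to the same group, a search for either term returns the same result set.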