Terminalfour Search

Use this content type to add Terminalfour Search to a page and display site search results.

Screenshot: example of the Terminalfour Search content type.

Content Type Details

  • ID: 326
  • Name: Terminalfour Search
  • Minimum user level: Administrator
  • Compatible with page layouts: Full Width

Content Type Elements Details

Name | Description | Size | Type | Required | Conditionally Shown
Name | The Name element | 80 characters | Plain Text | Yes | No
Title | Add the title; this is the main H1 heading on the page | 80 characters | Plain Text | Yes | No
No Results Content | Add content that is displayed when no results are returned. See WYSIWYG content options. | 1000 characters | HTML | Yes | No

Overview

Terminalfour Search is powered by three distinct components:

  1. Terminalfour Search Crawler: Responsible for visiting pages and fetching content.
  2. Site Search Dashboard: Responsible for the user interface, ranking, and display logic.
  3. Terminalfour Search content type: Adds the search results page to the website.

User Interface & Implementation

The Site Search is accessed via the magnifying glass icon in the website header. Clicking this reveals an overlay with a search input.

  • Front-end Code: Located in the Handlebars Partial hccHeaderHTML.
  • Live Example: See the Site Search page.

Data Collection (The Crawler)

The Terminalfour Search Crawler is configured to crawl the website daily to fetch metadata and content.

  • Configuration: Managed via the houston-cc-main configuration, which includes URL settings, exclusions, and metadata mappings.
  • Indexing Logic: The crawler must be able to "find" a page to index it. If a page is hidden from navigation and has no inbound links, the crawler will not discover it.
  • Robots.txt: The robots.txt file added by the Robots File content type is configured to allow the crawler access. For example:
    User-agent: terminalfour-nutch-spider
    Allow: /
    Crawl-delay: 0.5

Metadata

The crawler picks up metadata from pages. For a full list of how fields are assigned, see metadata.

Automatic Data Mapping

The crawler automatically collects the following data fields and pushes them to Site Search; they are visible in the Site Search Dashboard:

Field | Description
host | Host name of the URL
url | The full page URL
id | The unique identifier (URL)
content | The content extracted from the body of the page
tstamp | Timestamp of when the URL was last fetched
urlDepth | How many clicks deep the page is from the root
url_keywords | Keywords extracted from the URL string
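The crawler's exact derivation of these fields is not documented here, but as a rough illustration (function names and the example URL are hypothetical, not Terminalfour's implementation), urlDepth and url_keywords could be computed from a URL like this:

```python
from urllib.parse import urlparse

def url_depth(url: str) -> int:
    """How many clicks deep the page is from the root: count non-empty path segments."""
    path = urlparse(url).path
    return len([segment for segment in path.split("/") if segment])

def url_keywords(url: str) -> list[str]:
    """Keywords extracted from the URL string: split path segments on hyphens."""
    path = urlparse(url).path
    words = []
    for segment in path.split("/"):
        words.extend(word for word in segment.split("-") if word)
    return words

url = "https://www.example.edu/admissions/how-to-apply/"
print(url_depth(url))     # 2
print(url_keywords(url))  # ['admissions', 'how', 'to', 'apply']
```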

Content Exclusion & Visibility

There are several ways to prevent a page from being indexed or appearing in search results, depending on your access level:

Method | Effect | Access Level
'Remove from crawl' element | Check the 'Remove from crawl' element on a section's 'General' tab; this adds a robots noindex meta tag. The page is not indexed, but links on the page are still followed. | Moderator
'Hide from search' element | Check the 'Hide from search' element on a section's 'General' tab; this adds a sectionDisplay false meta tag. The page is crawled, but hidden from the search results interface. | Moderator
Metadata tab 'Robots' element | Set a custom robots meta tag, e.g. noindex, nofollow, to prevent indexing and stop the crawler from following links. | Moderator
Robots.txt Disallow rules | Add Disallow rules for the crawler to the robots.txt file using the Robots File content type to stop the crawler visiting specific areas/folders entirely, e.g. Disallow: /component-library/* | Administrator
URL Filter Regex | Add exclusion rules directly in the Crawler settings. | Crawler Access
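For illustration, the first three methods above could render in a page's head roughly as follows (exact markup depends on the page layout; the sectionDisplay name is taken from the table above):

```html
<!-- 'Remove from crawl': page is not indexed, links are still followed -->
<meta name="robots" content="noindex">

<!-- 'Hide from search': page is crawled, but hidden from the results interface -->
<meta name="sectionDisplay" content="false">

<!-- Metadata tab 'Robots' element with a custom value -->
<meta name="robots" content="noindex, nofollow">
```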

Search Management & Customization

Search behavior is managed in the Site Search Dashboard, where the following settings can be customized and fine-tuned:

  • Searchable Fields: Define which metadata fields are indexed for searching and which are returned for display in the results list.
  • Ranking: Adjust weighted ranking criteria, e.g. to give more weight to specific metadata fields.
  • Search Rules: Create custom logic to improve results:
    • Promotions: Pin specific internal or external pages to the top of results for certain keywords.
    • Synonyms: Group related terms (e.g., "Student" and "Learner") so they return the same results.
    • Stop Words: Define common words to be ignored by the search engine to improve accuracy.
  • Facets (Filters): Manage the sidebar filters that allow users to narrow down results. There is currently a single facet for Type, which is mapped from the sectionType meta tag.
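For example, a page could expose the sectionType meta tag below (the content value is illustrative), which the Type facet then picks up:

```html
<meta name="sectionType" content="News">
```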