An entity in computer science can refer to a number of different concepts, depending on the context. Within programming, an entity can be an HTML entity, an element of an entity-relationship model in a database, or a building block of a programming paradigm, such as the entity-component systems often found in game development.
More generally, an entity is a piece of information, whether that is a character, a word, an object, or some other abstract unit. Exactly what constitutes an entity, and how it is used, depends on context. Google uses entities in a related sense to provide richer search results and related content.
Importance of entities in SEO
Google defines an entity as: "A thing or concept that is singular, unique, well-defined, and distinguishable." This is a very broad definition, but it is important within the realm of SEO. Unfortunately, Google does not publish the exact details of its use of entities, so what is known is largely gleaned from patents, statements from Google developers, and experimentation.
Though the exact workings of entities are unknown, there are methods to help Google understand the content on your website better. This form of SEO involves adding structured data to pages. There are several different formats that can be used to provide structured data, such as JSON-LD, microdata, and RDFa. These different formats all provide a schema of the pertinent information on a page.
Structured data can specify the type of content on the page (e.g. a recipe, a news article, a movie review), any people involved (people mentioned within the page or the author of the page), when the page was published, and other miscellaneous data. The information is highly dependent on the page. As an example, a page containing a recipe may include the prep time, cooking time, ingredients, and recipe yield within its structured data.
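To make the recipe example concrete, here is a minimal sketch of how such structured data could be produced as JSON-LD. The property names (prepTime, cookTime, recipeYield, recipeIngredient, datePublished) come from the schema.org Recipe type; the recipe itself, the author name, and the helper function to_json_ld_script are invented for illustration.

```python
import json

# Hypothetical recipe data; every value here is illustrative.
recipe = {
    "@context": "https://schema.org",
    "@type": "Recipe",
    "name": "Simple Pancakes",
    "author": {"@type": "Person", "name": "Jane Doe"},
    "datePublished": "2024-01-15",
    "prepTime": "PT10M",   # ISO 8601 duration: 10 minutes
    "cookTime": "PT15M",   # ISO 8601 duration: 15 minutes
    "recipeYield": "4 servings",
    "recipeIngredient": ["200 g flour", "2 eggs", "300 ml milk"],
}


def to_json_ld_script(data: dict) -> str:
    """Wrap a dict in the script tag used to embed JSON-LD in an HTML page.

    Assumes the values contain no "</script>" sequence; real pages would
    need to escape that before embedding.
    """
    body = json.dumps(data, indent=2)
    return f'<script type="application/ld+json">\n{body}\n</script>'


print(to_json_ld_script(recipe))
```

The resulting script element would typically be placed in the HTML of the page it describes. Microdata and RDFa express the same properties as attributes on the visible HTML instead of in a separate block.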
Google does not guarantee that provided structured data will be used in search results. It evaluates this information on three major factors: content, relevance, and completeness. Structured data that is up to date and high quality, highly relevant to the page content, and not missing any pertinent information is more likely to be taken into consideration by Google and included in search results.
How Google uses entities in search
In search, an entity can seem similar to a keyword at first glance, and they both relate to the content on a webpage. However, where a keyword is bound by language, an entity can be anything individually distinguishable: a brand, a person, a place, a concept, and more.
Entities are used to understand concepts and their relations in content. For example, someone may search for "presidents of the United States of America". All sorts of content may be returned for this search term. Each individual president may have a wealth of information stored about them, and this information may be spread across many media: pictures, videos, audio, and text. All of it may be considered under the umbrella entity of that individual president.
By considering data in terms of entities, Google can gain a deeper understanding of content and provide more relevant search results. In this way, each entity can be considered a node, and these nodes are linked together by relationships. Information like birth dates, death dates, notable accomplishments, other presidents, and world leaders may all be related to these nodes in some way, and this information can be included in search results.
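The node-and-relationship idea can be sketched as a small graph of triples. This is only an illustration of the general concept, not a description of how Google stores entities; the relationship names (born, died, held_office, preceded_by) and the helper functions are invented for this example.

```python
from collections import defaultdict

# Each triple is (subject entity, relationship, object entity or value).
# The relationship names are hypothetical labels chosen for this sketch.
triples = [
    ("Thomas Jefferson", "born", "1743-04-13"),
    ("Thomas Jefferson", "died", "1826-07-04"),
    ("Thomas Jefferson", "held_office", "President of the United States"),
    ("Thomas Jefferson", "preceded_by", "John Adams"),
    ("John Adams", "held_office", "President of the United States"),
]

# Adjacency list: entity -> list of (relationship, target) edges.
graph = defaultdict(list)
for subject, relation, obj in triples:
    graph[subject].append((relation, obj))


def related(entity, relation):
    """Return everything linked to `entity` by `relation`."""
    return [obj for rel, obj in graph.get(entity, []) if rel == relation]


def entities_with(relation, obj):
    """Return all entities that have the edge (relation, obj), sorted by name."""
    return sorted(s for s, edges in graph.items() if (relation, obj) in edges)


print(related("Thomas Jefferson", "preceded_by"))
print(entities_with("held_office", "President of the United States"))
```

Looking up a node returns its attributes, and following an edge in reverse finds other entities that share a relationship, which is the kind of traversal that could surface related people or facts alongside a search result.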
Google’s Knowledge Graph as an example of the application of entities
Google's application of this concept is called the Google Knowledge Graph. Unfortunately, Google does not publish information about the inner workings of the Knowledge Graph, so publicly available data regarding the exact implementation of Knowledge Graph and how it relates to SEO is limited.
Google Knowledge Graph information is used to provide extra content to search results. So, for example, search results for "Thomas Jefferson" would include a box with pertinent data about him: images, notable accomplishments, a short blurb, and other biographical data, like birth and death dates. Related searches, such as other presidents and statesmen from his era, are also listed.
Screenshot with Google Knowledge Graph of google.com
This information is drawn from the Google Knowledge Graph. Thomas Jefferson is an entity, and images, biographical data, and related searches all form part of the entity. While its exact implementation is unknown, there are some clues to how it works.
A concept called 'co-occurrence' is important in how Google forms relationships between different entities. Co-occurrence refers to how frequently two different entities appear together in some way, such as how often two names are mentioned in the same content. In this way, Google may 'learn' about the relationship between a president and vice president from how frequently their names are mentioned together.
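As a rough illustration of co-occurrence, the sketch below counts how often pairs of known entity names appear in the same piece of text. It assumes a fixed list of entity names and simple substring matching; the sample sentences and the function count_cooccurrences are written for this example, and Google's actual method is not public.

```python
from collections import Counter
from itertools import combinations

# Illustrative sample texts; a real system would process far more content.
documents = [
    "Aaron Burr was vice president under Thomas Jefferson.",
    "Thomas Jefferson and Aaron Burr tied in the electoral vote of 1800.",
    "John Adams preceded Thomas Jefferson as president.",
]

# Assumed list of entity names to look for.
known_entities = ["Thomas Jefferson", "Aaron Burr", "John Adams"]


def count_cooccurrences(docs, entities):
    """Count, for each pair of entities, the documents mentioning both."""
    counts = Counter()
    for doc in docs:
        # Sort so each pair is always stored in the same order.
        present = sorted(e for e in entities if e in doc)
        for pair in combinations(present, 2):
            counts[pair] += 1
    return counts


counts = count_cooccurrences(documents, known_entities)
for pair, n in counts.most_common():
    print(pair, n)
```

Pairs with higher counts are mentioned together more often, which could be treated as a signal that the two entities are related. A count alone does not say what the relationship is, only that one is likely to exist.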