Semi-structured data is data that is not stored in a fixed tabular format but still has some hierarchy and separation between the fields of each record. It sits between structured data (such as a relational database table or a spreadsheet) and unstructured data (such as a free-text document or an image file).
Examples of semi-structured data include crawled web pages stored as XML files and JSON documents recording machine logs. Such data carries structural markers (such as tags or labels) but does not follow a strict schema or data model, so two records of the same kind may contain different fields. Sensor readings, server and application logs, and clickstream data are other typical sources of semi-structured data.
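To make the idea of "labeled fields without a strict schema" concrete, the following is a minimal Python sketch. The log records, their field names (`ts`, `level`, `msg`, `http`, `user`, `tags`), and the helper functions `parse_records` and `field_paths` are all hypothetical, invented for illustration. The sketch parses a few JSON log lines and compares which fields appear in every record with which appear only in some.

```python
import json

# Hypothetical machine-log records, one JSON document per line.
# Every record has labeled fields, but no two records share the same shape.
RAW_LOGS = """\
{"ts": "2024-01-01T00:00:00Z", "level": "INFO", "msg": "service started"}
{"ts": "2024-01-01T00:00:05Z", "level": "ERROR", "msg": "request failed", "http": {"status": 500, "path": "/api/items"}}
{"ts": "2024-01-01T00:00:09Z", "level": "INFO", "msg": "user login", "user": {"id": 42}, "tags": ["auth", "web"]}
"""


def parse_records(raw):
    """Parse newline-delimited JSON text into a list of dicts."""
    return [json.loads(line) for line in raw.splitlines() if line.strip()]


def field_paths(value, prefix=""):
    """Return the set of dotted paths to the leaf fields of a nested record.

    Nested objects are descended into; lists and scalars count as leaves.
    """
    if isinstance(value, dict):
        paths = set()
        for key, child in value.items():
            child_prefix = f"{prefix}.{key}" if prefix else key
            paths |= field_paths(child, child_prefix)
        return paths
    return {prefix}


records = parse_records(RAW_LOGS)
per_record = [field_paths(record) for record in records]

common = set.intersection(*per_record)    # fields present in every record
all_fields = set.union(*per_record)       # fields present in at least one record
optional = all_fields - common            # fields that only some records carry

print("common fields:  ", sorted(common))
print("optional fields:", sorted(optional))
```

In this sample, only `ts`, `level`, and `msg` appear in every record, while the nested `http` and `user` fields and the `tags` list appear in just one record each. A fixed table would need a column for every possible field, left empty in most rows, which is the kind of mismatch that leads such data to be kept in formats like JSON or XML.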