Understanding the Types of Data Attributes

Dataset

A dataset is composed of data objects, and each data object represents an entity (a concept or unit of meaningful information).

In a database, a row corresponds to a data object, and a column signifies an attribute. An attribute represents the characteristics or features of a data object.
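
The row and column view can be sketched in a few lines of Python. The records below are hypothetical and exist only for illustration; each dictionary stands for one data object and each key for one attribute.

```python
# Hypothetical toy dataset: each dict is one data object (a row),
# and each key is an attribute (a column).
customers = [
    {"name": "Kim", "age": 34, "city": "Seoul"},
    {"name": "Lee", "age": 27, "city": "Busan"},
    {"name": "Park", "age": 45, "city": "Seoul"},
]

# The attribute names correspond to the column headers.
attributes = list(customers[0].keys())

# Reading one attribute across all data objects yields a column.
ages = [row["age"] for row in customers]

print(attributes)
print(ages)
```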

In machine learning, attributes are the data items a model works with. They are also referred to as features, variables, fields, or predictors.

Identifying the types of attributes is the first step in data preprocessing, because each type calls for different handling. Attributes can be divided into qualitative and quantitative types.

  • Qualitative: Nominal, Ordinal, Binary
  • Quantitative: Numeric, Discrete, Continuous

Types of Attributes

Qualitative

  • Nominal Attributes (related to names): The value of a nominal attribute is the name or symbol of an object. Because nominal values represent a category or state, they are also called categorical attributes, and there is no order (rank, position) among them.
  • Ordinal Attributes: Their values have a meaningful order or rank, but the magnitude of the difference between successive values is not known. What matters is the order of the values, not how far apart they are. Sizes such as small, medium, and large are an example.
  • Binary Attributes: Binary data has only two values or states.
    • Symmetric: Both values are equally important.
    • Asymmetric: One of the two values is more important than the other.
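
As a rough illustration of why these distinctions matter in preprocessing, the sketch below encodes one attribute of each qualitative type using plain Python. The values (colors, sizes, test results) and the chosen encodings are assumptions made for this example, not a prescribed method: one-hot encoding is a common choice for nominal data because it imposes no order, integer ranks preserve order for ordinal data, and for an asymmetric binary attribute the more important outcome is conventionally coded as 1.

```python
# Nominal: categories have no order, so one-hot encoding avoids implying one.
colors = ["red", "green", "blue", "green"]       # hypothetical values
categories = sorted(set(colors))                 # fixed column order for the encoding
one_hot = [[1 if c == cat else 0 for cat in categories] for c in colors]

# Ordinal: order is meaningful, so map each value to its rank.
# The gaps between ranks carry no meaning, only their order does.
size_rank = {"small": 0, "medium": 1, "large": 2}
sizes = ["medium", "small", "large"]             # hypothetical values
size_codes = [size_rank[s] for s in sizes]

# Asymmetric binary: code the more important outcome as 1.
test_results = ["negative", "positive", "negative"]  # hypothetical values
binary_codes = [1 if r == "positive" else 0 for r in test_results]

print(categories, one_hot)
print(size_codes)
print(binary_codes)
```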

Quantitative

  • Numeric Attributes: Numeric attributes are quantitative because they represent measurable quantities. Numeric attributes have two types: interval and ratio.
    • Interval Scale: Values are ordered and the differences between them are interpretable, but the scale has no true zero point. Interval values can be added or subtracted but cannot be meaningfully multiplied or divided. Temperature in Celsius or Fahrenheit is an example: even if the reading on one day is twice that of another day (20°C versus 10°C), it cannot be said that one day is twice as hot as the other.
    • Ratio Scale: These are numeric attributes with a true zero point. When a measurement is on a ratio scale, values can be described as multiples (or ratios) of other values. Values are ordered, differences between values can be calculated, and the mean, median, mode, interquartile range, and five-number summary can all be computed.
  • Discrete Attributes: Discrete attributes have a finite or countably infinite set of values, which can be numeric or categorical.
  • Continuous Attributes: Continuous attributes can take any value within a range, so the number of possible values is infinite. They are typically stored as floating-point numbers. For example, there are infinitely many values between 2 and 3.
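
The interval versus ratio distinction can be checked with a little arithmetic. The sketch below uses two hypothetical daily temperatures and a helper named celsius_to_kelvin, which is introduced here only for illustration. It shows that the difference between two Celsius readings is meaningful, while their ratio changes once the same temperatures are expressed in Kelvin, a scale with a true zero. The last part illustrates the continuous case by repeatedly finding new values between 2 and 3.

```python
def celsius_to_kelvin(celsius: float) -> float:
    """Convert Celsius (interval scale) to Kelvin (ratio scale)."""
    return celsius + 273.15

# Hypothetical readings for two days.
day_a_c, day_b_c = 10.0, 20.0

# Differences are meaningful on an interval scale.
difference = day_b_c - day_a_c

# The Celsius ratio is 2.0, but it depends on an arbitrary zero point.
naive_ratio = day_b_c / day_a_c

# On the Kelvin scale the ratio of the same two temperatures is about 1.035.
kelvin_ratio = celsius_to_kelvin(day_b_c) / celsius_to_kelvin(day_a_c)

print(difference, naive_ratio, round(kelvin_ratio, 3))

# Continuous attributes: halving the gap always yields another value
# strictly between 2 and 3.
low, high = 2.0, 3.0
midpoints = []
for _ in range(5):
    high = (low + high) / 2
    midpoints.append(high)

print(midpoints)
```

Floating-point numbers are only a finite approximation of a continuous range, so the loop would eventually stop producing new values, but the idea carries over to the underlying quantity being measured.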
