Understanding Gnocchi Archive Policies

In Gnocchi, every metric is attached to an archive policy. The archive policy defines how the metric's data is aggregated, using one or more of the available aggregation methods, and how long measures are kept for that metric.

Gnocchi ships with some built-in archive policies. We can list them using the Gnocchi client (gnocchi archive-policy list) to get a better understanding of the information they contain:

+--------+-------------+-----------------------------------------------------------------------+---------------------------------+
| name   | back_window | definition                                                            | aggregation_methods             |
+--------+-------------+-----------------------------------------------------------------------+---------------------------------+
| high   |           0 | - points: 3600, granularity: 0:00:01, timespan: 1:00:00               | std, count, min, max, sum, mean |
|        |             | - points: 10080, granularity: 0:01:00, timespan: 7 days, 0:00:00      |                                 |
|        |             | - points: 8760, granularity: 1:00:00, timespan: 365 days, 0:00:00     |                                 |
| medium |           0 | - points: 10080, granularity: 0:01:00, timespan: 7 days, 0:00:00      | std, count, min, max, sum, mean |
|        |             | - points: 8760, granularity: 1:00:00, timespan: 365 days, 0:00:00     |                                 |
| bool   |        3600 | - points: 31536000, granularity: 0:00:01, timespan: 365 days, 0:00:00 | last                            |
| low    |           0 | - points: 8640, granularity: 0:05:00, timespan: 30 days, 0:00:00      | std, count, min, max, sum, mean |
+--------+-------------+-----------------------------------------------------------------------+---------------------------------+

As we can see, each archive policy can have different aggregation methods. These methods are applied to every metric that uses the archive policy.
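To make the idea concrete, here is a minimal sketch of what aggregating measures at a given granularity means. It is not Gnocchi's implementation: the aggregate helper, the METHODS mapping and the sample measures are made up for illustration, timestamps are plain seconds, and only some of the methods from the table are shown.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical helper, not part of Gnocchi: group (timestamp, value)
# measures into buckets of `granularity` seconds and aggregate each bucket.
METHODS = {"min": min, "max": max, "sum": sum, "mean": mean, "count": len}


def aggregate(measures, granularity, method):
    buckets = defaultdict(list)
    for timestamp, value in measures:
        # Align each timestamp to the start of its bucket.
        buckets[timestamp - (timestamp % granularity)].append(value)
    func = METHODS[method]
    return {start: func(values) for start, values in sorted(buckets.items())}


# Five measures spread over a bit more than two minutes.
measures = [(0, 2.0), (30, 4.0), (60, 10.0), (90, 20.0), (125, 1.0)]
print(aggregate(measures, 60, "mean"))   # {0: 3.0, 60: 15.0, 120: 1.0}
print(aggregate(measures, 60, "count"))  # {0: 2, 60: 2, 120: 1}
```

With a granularity of one minute, the five raw measures collapse into three points per aggregation method.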

Back Window

By default, new measures can only be processed if they have timestamps in the future or if they are part of the last aggregation period. This last aggregation period size is based on the largest granularity defined in the archive policy definition.

Setting a back window allows Gnocchi to process measures that are older than this period. The back window value is the number of coarsest periods (periods of the largest granularity) to keep. This way it is possible to process measures that are older than the last period boundary.
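One way to picture the back window is as a cutoff timestamp. The sketch below assumes the cutoff is the start of the period containing the newest measure, moved back by back_window periods. The function name and the use of plain seconds are illustrative and are not taken from Gnocchi's source.

```python
def oldest_accepted_timestamp(last_timestamp, period, back_window=0):
    """Return the oldest timestamp that could still be processed.

    Assumption for illustration: `period` is the largest granularity in
    the policy, in seconds, and timestamps are seconds since some epoch.
    """
    # Start of the aggregation period that contains the newest measure.
    period_start = last_timestamp - (last_timestamp % period)
    # Each unit of back window keeps one more earlier period open.
    return period_start - back_window * period


# Newest measure at 10:20:00 (37200 s), largest granularity of one hour.
print(oldest_accepted_timestamp(37200, 3600))                 # 36000
print(oldest_accepted_timestamp(37200, 3600, back_window=2))  # 28800
```

Under these assumptions, a back window of 0 accepts measures from 10:00:00 onwards, while a back window of 2 moves the cutoff back to 08:00:00.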

Archive Policy Definition

An archive policy's definition consists of three elements:

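+-------------+-----------------------------------------------------------------------+---------------+
| Field       | Meaning                                                               | Examples      |
+-------------+-----------------------------------------------------------------------+---------------+
| points      | Number of points inside a timespan                                    | 12, 48        |
| granularity | The time between two measures in an aggregated timeseries of a metric | 1s            |
| timespan    | How long measures will be kept in the metric                          | 1 day, 1 hour |
+-------------+-----------------------------------------------------------------------+---------------+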

For example, if an item is defined as 12 points over 1 hour, that indicates one point every 5 minutes. In a similar way, if an item is defined as 1 point every 1 hour over 1 day, this indicates 24 points.

Knowing this, any two of the three elements are enough to determine the third.
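The relationship behind these examples is timespan = granularity × points. The following sketch works through the two examples with Python's timedelta. The complete_item helper is hypothetical and exists only to show the arithmetic.

```python
from datetime import timedelta


def complete_item(granularity=None, points=None, timespan=None):
    """Fill in whichever of the three elements is missing.

    Illustrative helper only; it relies on
    timespan = granularity * points.
    """
    given = [x for x in (granularity, points, timespan) if x is not None]
    if len(given) < 2:
        raise ValueError("need at least two of granularity/points/timespan")
    if granularity is None:
        granularity = timespan / points
    elif points is None:
        points = int(timespan / granularity)
    # Recompute the timespan so the three values always agree.
    timespan = granularity * points
    return granularity, points, timespan


# 12 points over 1 hour: one point every 5 minutes.
print(complete_item(points=12, timespan=timedelta(hours=1)))
# 1 point every hour over 1 day: 24 points.
print(complete_item(granularity=timedelta(hours=1),
                    timespan=timedelta(days=1)))
```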

The class responsible for representing a definition is ArchivePolicyItem, which can be found in /gnocchi/archive_policy.py:

class ArchivePolicyItem(dict):
    def __init__(self, granularity=None, points=None, timespan=None):
        if (granularity is not None
           and points is not None
           and timespan is not None):
            if timespan != granularity * points:
                raise ValueError(
                    u"timespan ≠ granularity × points")

        if granularity is not None and granularity <= 0:
            raise ValueError("Granularity should be > 0")

        if points is not None and points <= 0:
            raise ValueError("Number of points should be > 0")

        if granularity is None:
            if points is None or timespan is None:
                raise ValueError(
                    "At least two of granularity/points/timespan "
                    "must be provided")
            granularity = round(timespan / float(points))
        else:
            granularity = float(granularity)

        if points is None:
            if timespan is None:
                self['timespan'] = None
            else:
                points = int(timespan / granularity)
                self['timespan'] = granularity * points
        else:
            points = int(points)
            self['timespan'] = granularity * points

        self['points'] = points
        self['granularity'] = granularity

    @property
    def granularity(self):
        return self['granularity']

    @property
    def points(self):
        return self['points']

    @property
    def timespan(self):
        return self['timespan']

This class can be instantiated in a few different ways, as seen in ArchivePolicy's constructor:

# ...
elif isinstance(d, dict):
    self.definition.append(ArchivePolicyItem(**d))
elif len(d) == 2:
    self.definition.append(
        ArchivePolicyItem(points=d[0], granularity=d[1]))

The full ArchivePolicy constructor can also be found in gnocchi/archive_policy.py:

class ArchivePolicy(object):

# ...

    def __init__(self, name, back_window, definition,
                 aggregation_methods=None):
        self.name = name
        self.back_window = back_window
        self.definition = []
        for d in definition:
            if isinstance(d, ArchivePolicyItem):
                self.definition.append(d)
            elif isinstance(d, dict):
                self.definition.append(ArchivePolicyItem(**d))
            elif len(d) == 2:
                self.definition.append(
                    ArchivePolicyItem(points=d[0], granularity=d[1]))
            else:
                raise ValueError(
                    "Unable to understand policy definition %s" % d)

In the constructor above we can see how the archive policy is initialized and how the definitions are added to the object depending on how they were passed in the request.
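The accepted input shapes can be tried out with a small stand-in that mirrors the branching of the constructor. This is only a sketch: normalize_definition is a hypothetical name, and it returns plain dicts where the real code builds ArchivePolicyItem objects.

```python
def normalize_definition(d):
    """Turn one definition entry into keyword arguments.

    Stand-in for the branching in the constructor shown above; it does
    not validate or complete the values.
    """
    if isinstance(d, dict):
        return dict(d)
    if len(d) == 2:
        # A two-element sequence is read as (points, granularity).
        return {"points": d[0], "granularity": d[1]}
    raise ValueError("Unable to understand policy definition %r" % (d,))


print(normalize_definition({"granularity": 60, "points": 10}))
print(normalize_definition((12, 300)))
```

A dict is passed through as keyword arguments, and a pair such as (12, 300) is interpreted as 12 points with a granularity of 300.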

Archive Policies Controller

The ArchivePoliciesController in /gnocchi/rest/__init__.py contains the API's post() method, which creates and stores new archive policies:

class ArchivePoliciesController(rest.RestController):

    @pecan.expose('json')
    def post(self):
        # NOTE(jd): Initialize this one at run-time because we rely on conf
        conf = pecan.request.conf
        enforce("create archive policy", {})
        ArchivePolicySchema = voluptuous.Schema({
            voluptuous.Required("name"): six.text_type,
            voluptuous.Required("back_window", default=0): PositiveOrNullInt,
            voluptuous.Required(
                "aggregation_methods",
                default=set(conf.archive_policy.default_aggregation_methods)):
            [ValidAggMethod],
            voluptuous.Required("definition"):
            voluptuous.All([{
                "granularity": Timespan,
                "points": PositiveNotNullInt,
                "timespan": Timespan,
                }], voluptuous.Length(min=1)),
            })

        body = deserialize_and_validate(ArchivePolicySchema)
        # Validate the data
        try:
            ap = archive_policy.ArchivePolicy.from_dict(body)
        except ValueError as e:
            abort(400, e)
        enforce("create archive policy", ap)
        try:
            ap = pecan.request.indexer.create_archive_policy(ap)
        except indexer.ArchivePolicyAlreadyExists as e:
            abort(409, e)

        location = "/archive_policy/" + ap.name
        set_resp_location_hdr(location)
        pecan.response.status = 201
        return ap

Data validation is done using the voluptuous library. Julien Danjou has written a very good blog post about voluptuous.

Voluptuous validates that the required fields are present and that each value has the expected data type. All of this is defined in a schema (voluptuous.Schema). Voluptuous data types are actually just functions called with one argument, the value, which should either return the value or raise an Invalid or ValueError exception.
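The "data type is just a function" idea can be illustrated without voluptuous itself. In the sketch below, positive_not_null_int and validate are hypothetical stand-ins written with the standard library only. They are not Gnocchi's PositiveNotNullInt, and validate is not voluptuous.Schema.

```python
def positive_not_null_int(value):
    """Hypothetical validator: accept integers strictly greater than zero."""
    number = int(value)  # int() itself raises ValueError on bad strings
    if number <= 0:
        raise ValueError("Value must be positive")
    return number


def validate(schema, data):
    """Tiny stand-in for a schema: apply each validator to its field."""
    return {key: check(data[key]) for key, check in schema.items()}


item_schema = {"points": positive_not_null_int}
print(validate(item_schema, {"points": "12"}))  # {'points': 12}

try:
    validate(item_schema, {"points": 0})
except ValueError as exc:
    print("rejected:", exc)  # rejected: Value must be positive
```

Each validator takes the raw value and either returns the value to store or raises, which is the same contract the schema in post() relies on.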


