Allow unixtime-style timestamp extraction #30

Open: wants to merge 3 commits into base: master
11 changes: 9 additions & 2 deletions README.rst
@@ -64,6 +64,12 @@ Features
(?P<second>\d{2})?
)?

If your files are, for example, suffixed with UNIX timestamps, you can specify a
regular expression that exposes a named capture group ``unixtime``, like this::

# Use UNIX timestamps
(?P<unixtime>\d+)
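
As a rough illustration of what such a pattern captures, the sketch below applies it with Python's standard `re` module to a hypothetical filename (`snapshot-1612396800.tar.gz` is made up for this example) and converts the captured digits to a UTC datetime. It only demonstrates the capture group and is not how rotate-backups itself is invoked.

```python
import datetime
import re

# The pattern from the README example above.
pattern = re.compile(r"(?P<unixtime>\d+)")

# Hypothetical filename, chosen only for illustration.
filename = "snapshot-1612396800.tar.gz"

match = pattern.search(filename)
seconds = int(match.group("unixtime"), 10)

# Interpret the captured digits as seconds since the epoch, in UTC.
stamp = datetime.datetime.fromtimestamp(seconds, tz=datetime.timezone.utc)
print(stamp.isoformat())
```

Note that a bare `\d+` matches the first run of digits it finds, so filenames that contain other digits before the timestamp probably need a more specific pattern, such as the `-(?P<unixtime>\d+)\.tar\.gz` used in the test case of this pull request.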

**All actions are logged**
Log messages are saved to the system log (e.g. ``/var/log/syslog``) so you
can retrace what happened when something seems to have gone wrong.
@@ -155,8 +161,9 @@ intended you have no right to complain ;-).
usage of the ``-H``, ``--hourly`` option for details about ``COUNT``."
"``-t``, ``--timestamp-pattern=PATTERN``","Customize the regular expression pattern that is used to match and extract
timestamps from filenames. ``PATTERN`` is expected to be a Python compatible
regular expression that must define the named capture groups 'year',
'month' and 'day' and may define 'hour', 'minute' and 'second'."
regular expression that must define a named capture group 'unixtime' or the

@rsommer can you provide better documentation or an example of how to use this new unixtime capture group? I had to read the test case to understand it. There is a section lower in the docs, "Supported configuration options", that contains some regex examples. That would be a good spot for more elaboration.

Author:

I added a short example to the README file, hope that helps - otherwise this seems to be abandoned, since the original pull request was untouched for over a year :(

named capture groups 'year', 'month' and 'day' and may define 'hour',
'minute' and 'second'."
"``-I``, ``--include=PATTERN``","Only process backups that match the shell pattern given by ``PATTERN``. This
argument can be repeated. Make sure to quote ``PATTERN`` so the shell doesn't
expand the pattern before it's received by rotate-backups."
59 changes: 45 additions & 14 deletions rotate_backups/__init__.py
@@ -447,6 +447,16 @@ def strict(self):
"""
return True

@mutable_property
def _is_unixtime(self):
"""
Is the given pattern used to extract a unix timestamp?

This private property reflects whether the given regex is used to extract
a unix timestamp from file or directory names.
"""
return False

@mutable_property
def timestamp_pattern(self):
"""
@@ -458,8 +468,9 @@ def timestamp_pattern(self):
:func:`re.compile()` documentation for details).

The regular expression pattern is expected to be a Python compatible
regular expression that defines the named capture groups 'year',
'month' and 'day' and optionally 'hour', 'minute' and 'second'.
regular expression that defines the named capture group 'unixtime' or
the named capture groups 'year', 'month' and 'day' and optionally
'hour', 'minute' and 'second'.

String values are automatically coerced to compiled regular expressions
by calling :func:`~humanfriendly.coerce_pattern()`, in this case only
@@ -476,10 +487,15 @@ def timestamp_pattern(self):
def timestamp_pattern(self, value):
"""Coerce the value of :attr:`timestamp_pattern` to a compiled regular expression."""
pattern = coerce_pattern(value, re.VERBOSE)
for component, required in SUPPORTED_DATE_COMPONENTS:
if component not in pattern.groupindex and required:
raise ValueError("Pattern is missing required capture group! (%s)" % component)
set_property(self, 'timestamp_pattern', pattern)
if "unixtime" in pattern.groupindex:
set_property(self, 'timestamp_pattern', pattern)
self._is_unixtime = True
else:
for component, required in SUPPORTED_DATE_COMPONENTS:
if component not in pattern.groupindex and required:
raise ValueError("Pattern is missing required capture group! (%s)" % component)
set_property(self, 'timestamp_pattern', pattern)
self._is_unixtime = False
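
The validation in this setter can be sketched as a standalone function. This is a minimal sketch under stated assumptions: `validate_timestamp_pattern` is a hypothetical helper name, `re.compile()` stands in for `coerce_pattern()`, and the contents of `SUPPORTED_DATE_COMPONENTS` are reconstructed from the docstring (year, month and day required; hour, minute and second optional) rather than copied from the module.

```python
import re

# Assumed reconstruction of SUPPORTED_DATE_COMPONENTS as (name, required)
# pairs, based on the docstring above; the real module may differ.
SUPPORTED_DATE_COMPONENTS = (
    ("year", True),
    ("month", True),
    ("day", True),
    ("hour", False),
    ("minute", False),
    ("second", False),
)


def validate_timestamp_pattern(value):
    """Hypothetical helper: return (compiled_pattern, is_unixtime)."""
    pattern = re.compile(value, re.VERBOSE)
    # A 'unixtime' group short-circuits the date component checks.
    if "unixtime" in pattern.groupindex:
        return pattern, True
    for component, required in SUPPORTED_DATE_COMPONENTS:
        if required and component not in pattern.groupindex:
            raise ValueError(
                "Pattern is missing required capture group! (%s)" % component
            )
    return pattern, False
```

With this shape, a pattern that defines `unixtime` is accepted even if it defines none of the date components, which mirrors the branch added in the diff.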

def rotate_concurrent(self, *locations, **kw):
"""
@@ -678,15 +694,30 @@ def match_to_datetime(self, match):
"""
kw = {}
captures = match.groupdict()
for component, required in SUPPORTED_DATE_COMPONENTS:
value = captures.get(component)
if value:
kw[component] = int(value, 10)
elif required:
raise ValueError("Missing required date component! (%s)" % component)
if self._is_unixtime:
base = int(match.groupdict().get("unixtime"))
# Try seconds- and milliseconds-precision timestamps.
for value in (base, base / 1000):
try:
timestamp = datetime.datetime.fromtimestamp(value)
break
except (ValueError, OverflowError, OSError):
timestamp = None
if timestamp is None:
raise ValueError("%r could not be extracted as unix timestamp" % base)
else:
kw[component] = 0
return datetime.datetime(**kw)
logger.verbose("Extracted timestamp %r from %r", timestamp, value)
return timestamp
else:
for component, required in SUPPORTED_DATE_COMPONENTS:
value = captures.get(component)
if value:
kw[component] = int(value, 10)
elif required:
raise ValueError("Missing required date component! (%s)" % component)
else:
kw[component] = 0
return datetime.datetime(**kw)
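
The seconds-then-milliseconds fallback in the unixtime branch can be tried in isolation with a sketch like the following. `unixtime_to_datetime` is a hypothetical name used only here, and catching `OverflowError` and `OSError` in addition to `ValueError` is an assumption based on the exceptions that `datetime.datetime.fromtimestamp()` is documented to raise for out-of-range values.

```python
import datetime


def unixtime_to_datetime(raw):
    """Hypothetical helper: parse a digit string as a unix timestamp.

    The value is first interpreted as seconds; if that is out of range
    it is interpreted as milliseconds instead.
    """
    base = int(raw, 10)
    for value in (base, base / 1000):
        try:
            return datetime.datetime.fromtimestamp(value)
        except (ValueError, OverflowError, OSError):
            # Out of range for this interpretation, try the next one.
            continue
    raise ValueError("%r could not be extracted as unix timestamp" % raw)


# The two filenames from the test case in this pull request:
print(unixtime_to_datetime("1612396800061"))  # parsed as milliseconds
try:
    unixtime_to_datetime("1807311501019237")
except ValueError as e:
    print("rejected:", e)
```

One consequence of this ordering is that a millisecond value small enough to also be a valid number of seconds is read as seconds, so the fallback only helps for timestamps that are out of range when taken as seconds.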

def group_backups(self, backups):
"""
5 changes: 3 additions & 2 deletions rotate_backups/cli.py
@@ -73,8 +73,9 @@

Customize the regular expression pattern that is used to match and extract
timestamps from filenames. PATTERN is expected to be a Python compatible
regular expression that must define the named capture groups 'year',
'month' and 'day' and may define 'hour', 'minute' and 'second'.
regular expression that must define the named capture group 'unixtime' or
the named capture groups 'year', 'month' and 'day' and may define 'hour',
'minute' and 'second'.

-I, --include=PATTERN

15 changes: 15 additions & 0 deletions rotate_backups/tests.py
@@ -145,6 +145,21 @@ def test_argument_validation(self):
returncode, output = run_cli(main, '-n', '/root')
assert returncode != 0

def test_timestamp_dates(self):
"""Make sure filenames with unix timestamps don't cause an exception."""
with TemporaryDirectory(prefix='rotate-backups-', suffix='-test-suite') as root:
file_with_valid_date = os.path.join(root, 'snapshot-1612396800061.tar.gz')
file_with_invalid_date = os.path.join(root, 'snapshot-1807311501019237.tar.gz')
for filename in file_with_valid_date, file_with_invalid_date:
touch(filename)
program = RotateBackups(
rotation_scheme=dict(monthly='always'),
timestamp_pattern=r"-(?P<unixtime>\d+)\.tar\.gz"
)
backups = program.collect_backups(root)
assert len(backups) == 1
assert backups[0].pathname == file_with_valid_date

def test_invalid_dates(self):
"""Make sure filenames with invalid dates don't cause an exception."""
with TemporaryDirectory(prefix='rotate-backups-', suffix='-test-suite') as root: