hansken.trace: Interact with traces / search results
Note
Adding information like a note or a tag to a Trace allows the caller to force a refresh. Refreshing a project index is a potentially expensive operation; use this only if the added content is needed immediately.
Trace itself defines no programmatic way to remove tags, but ProjectContext does:
    for tag in trace.tags:
        # context can be retrieved from trace, if need be
        context.delete_tag(trace.uid, tag)
    # note that this loop will only delete the tags in the backend, trace
    # is left untouched here
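The effect of that loop can be illustrated without a Hansken connection. The following is a minimal sketch using in-memory stand-ins: `StubContext`, `StubTrace` and the uid string are hypothetical and are not part of hansken.py; only `delete_tag(trace.uid, tag)` mirrors the call shown above.

```python
class StubContext:
    """Hypothetical in-memory stand-in for a ProjectContext (not hansken.py code)."""

    def __init__(self):
        # maps a trace uid to the set of tags the "backend" knows about
        self.backend_tags = {}

    def delete_tag(self, trace_uid, tag):
        # mirrors the call used above: only the backend state changes
        self.backend_tags[trace_uid].discard(tag)


class StubTrace:
    """Hypothetical stand-in for a Trace that was fetched earlier."""

    def __init__(self, uid, tags, context):
        self.uid = uid
        self.tags = list(tags)
        self.context = context


context = StubContext()
context.backend_tags['some-image:0-1'] = {'todo', 'reviewed'}
trace = StubTrace('some-image:0-1', ['todo', 'reviewed'], context)

for tag in trace.tags:
    # context retrieved from the trace itself
    trace.context.delete_tag(trace.uid, tag)

remaining_backend = context.backend_tags[trace.uid]
remaining_local = trace.tags
```

After the loop, the stand-in backend holds no tags for the trace, while the local object still lists both. Code that needs the new state should request a fresh instance rather than rely on the object it already has.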
- image_from_uid(trace_uid)[source]
Splits trace_uid into its two parts, image and id, returning the first. Note that a Trace object will provide these as properties image_id and id.
- Parameters:
trace_uid (str) – a trace uid
- Returns:
the image UUID from trace_uid
- Return type:
str
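A minimal sketch of such a split, for illustration only. The separator between the image and id parts is not stated in this section; the colon used here, the helper name `split_uid` and the example uid are assumptions, not hansken.py's implementation.

```python
UID_SEP = ':'  # assumption: the separator between image and id is not documented above


def split_uid(trace_uid, sep=UID_SEP):
    """Split a uid into (image, id) at the first occurrence of sep."""
    image, found, trace_id = trace_uid.partition(sep)
    if not found:
        raise ValueError('no {!r} separator in uid: {!r}'.format(sep, trace_uid))
    return image, trace_id


# made-up uid, shaped like "<image uuid><sep><id>"
example_uid = '0f0e0d0c-1111-2222-3333-444455556666:0-1-2'
image, trace_id = split_uid(example_uid)
```

Splitting at the first separator only keeps the id part intact even if it contains further separator-like characters.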
- image_from_trace(trace)[source]
Attempts to get an image id from trace, whether trace is a Trace object or dict-like.
- Parameters:
trace – a Trace or dict-like trace
- Returns:
the image UUID from trace
- Return type:
str
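Accepting either an object or a dict-like value is a common duck-typing pattern. The sketch below shows one way it could look; the helper name `image_of` and the `'image'` mapping key are hypothetical, only the `image_id` property is taken from the description of Trace above.

```python
from collections.abc import Mapping
from types import SimpleNamespace


def image_of(trace):
    """Return an image id from an object with image_id, or from a dict-like."""
    image_id = getattr(trace, 'image_id', None)
    if image_id is not None:
        return image_id
    if isinstance(trace, Mapping):
        # 'image' is a hypothetical key name, chosen for this sketch only
        return trace['image']
    raise TypeError('unsupported trace value: {!r}'.format(trace))


from_object = image_of(SimpleNamespace(image_id='image-a'))
from_mapping = image_of({'image': 'image-b'})
```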
- class Trace[source]
Bases: AbstractTrace
Base class for traces. Defines convenience methods to navigate or manipulate a trace. Trace data may be accessed using open.
- ID_SEP = '-'
- property context
The ProjectContext instance that created this Trace.
- note(note, refresh=None)[source]
Add a note to this Trace.
- Parameters:
note (str) – the note itself
refresh – if True, force a full project refresh, making this note immediately searchable
- property notes
The notes attached to this Trace. Note that this does not include notes added by note.
- tag(tag, refresh=None)[source]
Tag this trace.
- Parameters:
tag (str) – the tag to set
refresh – if True, force a full project refresh, making this tag immediately searchable
- property privileged
The privileged state of this trace, either None or one of Privileged. Note that None is not a valid value when setting the privileged attribute, an operation that requires authorization.
- property creator
The tool that created this Trace, or None if unknown. Includes the version of that tool, e.g.: toolname 1.2.3.
Note
This value is formatted by hansken.py; it is not suitable for use with queries (like finding other traces created by the same tool).
- property tool_versions
The tools and versions that are responsible for this Trace’s metadata, as a dict mapping the names of tools to their respective versions. Tool versions typically include the versions of critical software libraries used by those tools.
- property audits
An audit log of user-initiated changes to this Trace in the form of a sequence of dicts, ordered by the audit's creation timestamp. The audit log can be empty, but never None.
- tracelets(tracelet_type, query=None, sort=None)[source]
Provides or retrieves tracelets of type tracelet_type.
The exact return type of a call to tracelets depends on the tracelet type being requested. If the remote defines the type to be ‘few’, the result will be a list of Tracelet objects. If the remote defines the type to be ‘many’, the result will be a SearchResult of Tracelet objects. Note that query can only be used with the latter.
- Parameters:
tracelet_type – the tracelet type to request
query – query to match tracelets to
sort – ordering of tracelets
- Returns:
a sized iterable of Tracelets (iterable once)
- property children
A SearchResult instance containing the child traces of this Trace, if any.
- property data_types
A set of data type names available for this Trace. These names can be used with calls to open or attribute access like:

    if 'raw' in trace.data_types:
        # trace has a raw data stream, attribute access to data.raw.size will be safe
        print('raw data size:', trace.data.raw.size)

    for data_type in trace.data_types:
        # format a file name as the trace's name, using the data type name as the extension
        # (e.g. "some-file.raw" or "another-file.text")
        out_file = '{}.{}'.format(trace.name, data_type)
        print('writing first 64 bytes to', out_file)
        with open(out_file, 'wb') as out_file:
            # out_file now opened for writing in binary mode
            # write the first 64 bytes of trace's stream of type data_type to the file
            out_file.write(trace.open(data_type, size=64).read())

- Returns:
data type names available for this Trace (possibly empty, but never None)
- Return type:
set
- open(stream='raw', offset=0, size=None, key=<auto-fetch>)[source]
Open a named data stream (default raw) for this Trace.
Note
Multiple calls to read(num_bytes) on the stream resulting from this call work fine in Python 3, but will fail in Python 2.
- Parameters:
stream – stream to read
offset – byte offset to start the stream on
size – the number of bytes to make available
key – key for the image of this trace (default is to fetch the key automatically, if it’s available)
- Returns:
a file-like object to read bytes from the named stream
- Return type:
io.BufferedReader
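Because the result is a file-like object, large streams can be consumed in fixed-size chunks instead of a single read() call. The sketch below shows that pattern under Python 3. An `io.BufferedReader` over in-memory bytes stands in for the object a call like `trace.open(offset=16, size=100)` would return; the helper `read_chunks` and the payload are hypothetical.

```python
import io
from functools import partial


def read_chunks(stream, chunk_size=4096):
    """Yield successive chunks from a file-like object until it is exhausted."""
    # iter() with a sentinel keeps calling stream.read(chunk_size) until it returns b''
    for chunk in iter(partial(stream.read, chunk_size), b''):
        yield chunk


payload = bytes(range(256)) * 4

# stand-in for the stream of a trace, limited to 100 bytes starting at byte offset 16
stream = io.BufferedReader(io.BytesIO(payload[16:16 + 100]))

chunks = list(read_chunks(stream, chunk_size=32))
chunk_sizes = [len(chunk) for chunk in chunks]
```

With a 100 byte window and 32 byte chunks, the last chunk is shorter than the others, so callers should not assume every chunk has the requested size.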
- descriptor(stream='raw', key=<auto-fetch>)[source]
Retrieve the data descriptor for a named stream (default raw) for this Trace.
- Parameters:
stream – stream to get the descriptor for
key – key for the image of this trace (default is to fetch the key automatically, if it’s available)
- Returns:
the stream’s data descriptor (as-is)
- property preview_types
A set of preview type names (mime types) available for this Trace. These names can be used with calls to preview.
- Returns:
preview type names available for this Trace (possibly empty, but never None)
- Return type:
set
- preview(mime_type)[source]
Gets a preview of a particular mime type, e.g. ‘text/plain’ or ‘image/png’.
- Parameters:
mime_type – the preview type to get
- Returns:
bytes or None
- snippets(query, num=100, before=200, after=200)[source]
Generate snippets surrounding term hits from query in any of the data streams of this trace.
- Parameters:
query – the query to generate snippets for (should contain term queries, or no snippets will be generated)
num – maximum number of snippets to return
before – number of bytes to include before the term hits
after – number of bytes to include after the term hits
- Returns:
list of Snippet instances
- update(key_or_updates=None, value=None, data=None, overwrite=False)[source]
Requests the remote to update or add metadata properties for this Trace.
Note
Calls to update will not update the source of the Trace it’s being called on. To get a Trace instance including the changes made after a successful call to update, use trace.context.trace(trace.uid) to request a new instance of a trace with this Trace’s identifier.
Please note that, for performance reasons, all changes are buffered and not directly effective in subsequent search, update and import requests. As a consequence, successive changes to a single trace might be ignored. Instead, all changes to an individual trace should be bundled in a single update or import request. The project index is refreshed automatically (by default every 30 seconds), so changes will become visible eventually.
- Parameters:
key_or_updates – either a str (the metadata property to be updated) or a mapping supplying both keys and values to be updated (or None if only data is supplied)
value – the value to update metadata property key to (used only when key_or_updates is a str)
data – a dict mapping data type / stream name to bytes to be imported
overwrite – whether properties to be imported should be overwritten if already present
- Returns:
processing information from remote
- child_builder(name=None)[source]
Create a TraceBuilder to build a trace to be saved as a child of this Trace. Note that name is a mandatory property for a trace, even though it is optional here. A name can be added later using the TraceBuilder.update method. Furthermore, a new trace will only be added to the index once explicitly saved (e.g. through TraceBuilder.build).
- Parameters:
name – the name for the trace being built
- Returns:
a TraceBuilder set up to create a child trace of this Trace
- class Privileged[source]
Bases: Enum
Possible privileged states of a Trace. Values that correspond to ‘not privileged’ (None or rejected) are falsy, making them suitable to check whether a trace is privileged.
- suspected = 'suspected'
trace is suspected of being privileged
- confirmed = 'confirmed'
trace is confirmed to be privileged
- rejected = 'rejected'
trace is confirmed to be not privileged
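Enum members are truthy by default in Python, so a falsy rejected state needs an explicit `__bool__`. The following sketch shows how such an enum could behave. `PrivilegedSketch` is a hypothetical re-creation for illustration and not the class shipped with hansken.py; only the member names and values are taken from the listing above.

```python
from enum import Enum


class PrivilegedSketch(Enum):
    """Hypothetical re-creation of the states listed above."""

    suspected = 'suspected'
    confirmed = 'confirmed'
    rejected = 'rejected'

    def __bool__(self):
        # 'rejected' means "confirmed to be not privileged", so it evaluates as False
        return self is not PrivilegedSketch.rejected


states = [None, PrivilegedSketch.rejected, PrivilegedSketch.suspected, PrivilegedSketch.confirmed]

# a plain truth test treats None and rejected alike
privileged_only = [state for state in states if state]
```

This lets a caller write a check such as `if trace.privileged:` without handling None and rejected separately.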
- class TraceModel[source]
Bases: DictView
Utility to deal with intricacies surrounding the trace / data model used by Hansken. Used by hansken.py to translate and validate user-specified metadata properties to their corresponding place in the data structure for a trace in Hansken.
- property intrinsics
The intrinsic properties (properties that any trace can have, regardless of its type(s)) defined by the trace model.
- is_intrinsic(steps)[source]
Checks whether the property defined by steps is an intrinsic property.
- Parameters:
steps – steps through a Trace’s data structure
- Returns:
whether the property defined by steps is an intrinsic property
- property origins
The origins defined by the trace model, typically system and user.
- property categories
The categories of types and properties defined by the trace model, e.g. extracted or annotated.
- property types
The trace types defined by the trace model, e.g. file or classification.
- property data_types
The named data types defined by the trace model for the “data” trace type.
- expand(name)[source]
Expands a trace property to ‘steps’ through a nested data structure.
Inserts a properties category if unspecified, does not include an origin.
- Parameters:
name – the property name to expand, excluding an origin
- Returns:
a tuple of ‘steps’
- Raises:
ValueError – when a provided name is not defined by the trace model or is missing required parts
- class TraceBuilder[source]
Bases: DictView
Utility class to aid in creating user-defined traces or updating existing ones. A TraceBuilder is a trace model aware view on a nested mapping, using the trace model both to validate requested updates and to find the correct spot for values in the nested mapping.
This class is not intended for direct user instantiation, see
- update(key_or_updates, value=None)[source]
Add or overwrite new metadata properties to this builder.
key_or_updates can mix dotted properties and nested structures; all keys and values are merged before applying updates. A TraceModel is used to find the proper fully qualified property names if needed, allowing both e.g. update('file.name', 'File Name') and update({'extracted': {'file': {'name': 'file name'}}}).
- Parameters:
key_or_updates – either a str (the metadata property to be updated) or a mapping supplying both keys and values to be updated (or None if only data is supplied)
value – the value to update metadata property key to (used only when key_or_updates is a str)
- Returns:
this TraceBuilder
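Merging dotted properties with nested structures can be pictured as expanding each dotted key into nested mappings and then merging recursively. The sketch below illustrates only that merging step. The function names are hypothetical, and it does not consult a TraceModel, so no category such as extracted is inserted and nothing is validated.

```python
def expand_dotted(key, value, separator='.'):
    """Turn 'a.b.c' and a value into {'a': {'b': {'c': value}}}."""
    nested = value
    for step in reversed(key.split(separator)):
        nested = {step: nested}
    return nested


def deep_merge(target, updates):
    """Recursively merge updates into target, in place; later values win."""
    for key, value in updates.items():
        if isinstance(value, dict) and isinstance(target.get(key), dict):
            deep_merge(target[key], value)
        else:
            target[key] = value
    return target


def merge_updates(updates):
    """Merge a mapping that mixes dotted keys and nested mappings."""
    merged = {}
    for key, value in updates.items():
        if isinstance(value, dict):
            # nested mappings may themselves contain dotted keys
            value = merge_updates(value)
        deep_merge(merged, expand_dotted(key, value))
    return merged


merged = merge_updates({'file.name': 'a.txt', 'file': {'size': 3}})
```

Both spellings end up in the same nested mapping, which is why a caller can mix them in a single update.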
- add_data(stream, data)[source]
Add data to this trace as a named stream.
- Parameters:
stream – name of the data stream to be added
data – data to be attached
- Returns:
this TraceBuilder
- property updates
A collection of updates tracked by this TraceBuilder.
- property context
The ProjectContext instance that created this TraceBuilder.
- property target
The combination of (project id, parent trace uid) this TraceBuilder applies to.
- child_builder(name=None)[source]
Creates a new TraceBuilder to build a child trace to the trace to be represented by this builder.
Note
Parent TraceBuilders should be built using the .build() call before their child builders as the unique trace identifier (uid) for the parent is needed to build a child trace.
- Parameters:
name – name of the new child trace
- Returns:
a TraceBuilder set up to save a new trace as the child trace of this builder
- build()[source]
Save the trace being built by this builder to remote.
Note
If this TraceBuilder was put in debug mode, the trace is not sent to remote but is instead logged at warning level.
- Returns:
the new trace’s uid (or None in debug mode)
- class Snippet(source, separator='.')[source]
Bases: DictView
Snippet result, enabling rendering of a highlighted snippet of text content. Usable as a dictionary where key 'content' contains a snippet of text and key 'highlights' contains a list of dictionaries encoding highlighted terms in the content.
- render(start='[[', end=']]')[source]
Render this snippet by surrounding highlights with start and end marker strings, e.g.:
    >>> my_snippet.render()
    'A [[snippet]] with the term "[[snippet]]" highlighted.'
    >>> my_snippet.render(start='<em>', end='</em>')
    'A <em>snippet</em> with the term "<em>snippet</em>" highlighted.'
- Parameters:
start – start marker around highlights
end – end marker around highlights
- Returns:
this Snippet, highlighted as a str
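Rendering of this kind amounts to walking the highlights in order and wrapping each highlighted range in the markers. The sketch below shows one way to do that on plain data. The layout of a highlight dictionary is not documented in this section, so the `'offset'` and `'length'` keys are assumptions, and `render_sketch` is a hypothetical helper rather than the Snippet implementation.

```python
def render_sketch(content, highlights, start='[[', end=']]'):
    """Surround each highlighted range in content with start and end markers."""
    parts = []
    position = 0
    # 'offset' and 'length' are assumed key names, see the note above
    for highlight in sorted(highlights, key=lambda h: h['offset']):
        begin = highlight['offset']
        stop = begin + highlight['length']
        parts.append(content[position:begin])
        parts.append(start + content[begin:stop] + end)
        position = stop
    parts.append(content[position:])
    return ''.join(parts)


content = 'A snippet with the term "snippet" highlighted.'
first = content.index('snippet')
second = content.index('snippet', first + 1)
highlights = [
    {'offset': second, 'length': len('snippet')},
    {'offset': first, 'length': len('snippet')},
]

rendered = render_sketch(content, highlights)
rendered_html = render_sketch(content, highlights, start='<em>', end='</em>')
```

Sorting the highlights first keeps the result stable when they arrive out of order, as in the example data. Overlapping highlights are not handled by this sketch.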