xphyle package

Public API

xphyle module

The main xphyle methods – xopen, popen, and open_.

class xphyle.BufferWrapper(fileobj: typing.Union[str, pathlib.PurePath, typing.IO], buffer: typing.Union[_io.StringIO, _io.BytesIO], compression: typing.Union[bool, str] = False, name: str = None) → None

Bases: xphyle.FileWrapper

Wrapper around a string/bytes buffer.

Parameters:
  • fileobj – The fileobj to wrap (the raw or wrapped buffer).
  • buffer – The raw buffer.
  • compression – Compression type.
getvalue() → typing.Union[bytes, str]

Returns the contents of the buffer.

class xphyle.EventListener(**kwargs: typing.Dict[typing.Any, typing.Any]) → None

Bases: typing.Generic

Base class for listener events that can be registered on a FileLikeWrapper.

Parameters:kwargs – keyword arguments to pass through to execute
execute(wrapper: E, **kwargs) → None

Handle an event. This method must be implemented by subclasses.

Parameters:
  • wrapper – The EventManager on which this event was registered.
  • kwargs – A union of the keyword arguments passed to the constructor and the __call__ method.
class xphyle.EventManager(*args, **kwargs)

Bases: object

Mixin type for classes that allow registering event listeners.

register_listener(event: typing.Union[str, xphyle.types.EventType], listener: xphyle.EventListener) → None

Register an event listener.

Parameters:
  • event – Event name (currently, only ‘close’ is recognized)
  • listener – A listener object, which must be callable with a single argument – this file wrapper.
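The listener contract described above (constructor keyword arguments combined with call-time keyword arguments, then handed to execute) can be sketched in plain Python. The names `SketchListener`, `SketchManager`, `RecordClose`, and `fire` are hypothetical stand-ins written for illustration, and the rule that call-time values override constructor values is an assumption of this sketch, not documented xphyle behavior.

```python
class SketchListener:
    """Hypothetical stand-in for an event listener base class."""

    def __init__(self, **kwargs):
        # Keyword arguments stored here are passed through to execute().
        self.create_args = kwargs

    def __call__(self, wrapper, **call_args):
        # Combine constructor and call-time arguments. Letting call-time
        # values win is an assumption made for this sketch.
        merged = dict(self.create_args)
        merged.update(call_args)
        self.execute(wrapper, **merged)

    def execute(self, wrapper, **kwargs):
        raise NotImplementedError("subclasses must implement execute()")


class SketchManager:
    """Hypothetical stand-in for a class that accepts listeners."""

    def __init__(self):
        self._listeners = {}

    def register_listener(self, event, listener):
        self._listeners.setdefault(event, []).append(listener)

    def fire(self, event, **kwargs):
        # Call every listener registered for this event name.
        for listener in self._listeners.get(event, []):
            listener(self, **kwargs)


class RecordClose(SketchListener):
    def execute(self, wrapper, **kwargs):
        wrapper.log = ("closed", kwargs)


manager = SketchManager()
manager.register_listener("close", RecordClose(dest="archive"))
manager.fire("close")
```

After `fire("close")`, the manager carries a `log` attribute holding the event marker and the merged keyword arguments.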
class xphyle.FileLikeWrapper(fileobj: typing.IO, compression: typing.Union[bool, str] = False) → None

Bases: xphyle.EventManager, xphyle.types.FileLikeBase

Base class for wrappers around file-like objects. By default, method calls are forwarded to the file object. Adds the following:

1. A simple event system by which registered listeners can respond to file events. Currently, ‘close’ is the only supported event.
2. Wraps file iterators in a progress bar (if configured).

Parameters:
  • fileobj – The file-like object to wrap.
  • compression – Whether the wrapped file is compressed.
close() → None

Close the file, close an open iterator, and fire ‘close’ events to any listeners.

closed
fileno() → int
flush() → None
isatty() → bool
mode
name
peek(size: int = 1) → typing.Union[bytes, str]

Return bytes/characters from the stream without advancing the position. At most one single read on the raw stream is done to satisfy the call.

Parameters:size – The max number of bytes/characters to return.
Returns:At most size bytes/characters. Unlike io.BufferedReader.peek(), will never return more than size bytes/characters.

Notes

If the file uses multi-byte encoding and N characters are desired, it is up to the caller to request size=2N.
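The "never more than size" guarantee can be illustrated with the standard library alone. This is a minimal sketch, assuming a binary `io.BufferedReader`; `peek_at_most` is a hypothetical helper name and is not part of xphyle.

```python
import io


def peek_at_most(reader: io.BufferedReader, size: int = 1) -> bytes:
    """Return up to `size` bytes without advancing the stream position."""
    # BufferedReader.peek() may return more bytes than requested, so the
    # result is sliced down to the requested size.
    return reader.peek(size)[:size]


reader = io.BufferedReader(io.BytesIO(b"hello world"))
head = peek_at_most(reader, 3)
# The position is unchanged, so a subsequent read starts at the beginning.
rest = reader.read(5)
```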

read(size: int = -1) → bytes
readable() → bool
readline(size: int = -1) → typing.Union[bytes, str]
readlines(hint: int = -1) → typing.List[typing.Union[bytes, str]]
seek(offset, whence: int = 0) → int
seekable() → bool
tell() → int
truncate(size: int = None) → int
writable() → bool
write(string: typing.Union[bytes, str]) → int
writelines(lines: typing.Iterable[typing.Union[bytes, str]]) → None
class xphyle.FileWrapper(source: typing.Union[str, pathlib.PurePath, typing.IO], mode: typing.Union[str, xphyle.types.FileMode] = 'w', compression: typing.Union[bool, str] = False, name: str = None, **kwargs) → None

Bases: xphyle.FileLikeWrapper

Wrapper around a file object.

Parameters:
  • source – Path or file object.
  • mode – File open mode.
  • compression – Compression type.
  • name – Use an alternative name for the file.
  • kwargs – Additional arguments to pass to xopen.
name
path

The source path.

class xphyle.Process(args, stdin: typing.Union[str, pathlib.PurePath, typing.IO, int] = None, stdout: typing.Union[str, pathlib.PurePath, typing.IO, int] = None, stderr: typing.Union[str, pathlib.PurePath, typing.IO, int] = None, **kwargs) → None

Bases: xphyle.EventManager, subprocess.Popen, xphyle.types.FileLikeBase, typing.Iterable

Subclass of subprocess.Popen with the following additions:

  • Provides :method:`Process.wrap_pipes` for wrapping stdin/stdout/stderr streams (e.g. to send compressed data to a process’ stdin or read compressed data from its stdout/stderr).
  • Provides :method:`Process.close` for properly closing stdin/stdout/stderr streams and terminating the process.
  • Implements required methods to make objects ‘file-like’.

Parameters:
  • args – Positional arguments, passed to subprocess.Popen constructor.
  • stdin, stdout, stderr – Identical to the same arguments to subprocess.Popen.
  • kwargs – Keyword arguments, passed to subprocess.Popen constructor.
check_valid_returncode(valid: typing.Container[int] = (0, None, <Signals.SIGPIPE: 13>, 141))

Check that the returncode does not have a value associated with an error state.

Raises:
close() → None
close1(timeout: float = None, raise_on_error: bool = False, record_output: bool = False, terminate: bool = False) → int

Close stdin/stdout/stderr streams, wait for process to finish, and return the process return code.

Parameters:
  • timeout – time in seconds to wait for stream to close; negative value or None waits indefinitely.
  • raise_on_error – Whether to raise an exception if the process returns an error.
  • record_output – Whether to store contents of stdout and stderr in place of the actual streams after closing them.
  • terminate – If True and timeout is a positive integer, the process is terminated if it doesn’t finish within timeout seconds.

Notes

If :attribute:`record_output` is True, and if stdout/stderr is a PIPE, any contents are read and stored as the value of :attribute:`stdout`/:attribute:`stderr`. Otherwise the data is lost.

Returns:The process returncode.
Raises:IOError – if raise_on_error is True and the process returns an error code.
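The close/wait/terminate sequence described above can be approximated with `subprocess.Popen` directly. The following is a rough sketch, not xphyle's implementation: `close_process` is a hypothetical helper, it treats any nonzero return code as an error, and it does not record output.

```python
import subprocess
import sys


def close_process(proc, timeout=None, raise_on_error=False, terminate=False):
    """Close the streams of a Popen, wait for it, and return its code."""
    # Closing stdin signals end-of-input to the child process.
    if proc.stdin is not None:
        proc.stdin.close()
    # A negative or missing timeout means "wait indefinitely".
    wait_for = timeout if timeout is not None and timeout > 0 else None
    try:
        proc.wait(timeout=wait_for)
    except subprocess.TimeoutExpired:
        if not terminate:
            raise
        proc.terminate()
        proc.wait()
    # Unread output is discarded when the pipes are closed.
    for stream in (proc.stdout, proc.stderr):
        if stream is not None:
            stream.close()
    if raise_on_error and proc.returncode != 0:
        raise IOError(f"process exited with code {proc.returncode}")
    return proc.returncode


# The child reads stdin until EOF, so closing stdin lets it exit cleanly.
proc = subprocess.Popen(
    [sys.executable, "-c", "import sys; sys.stdin.read()"],
    stdin=subprocess.PIPE,
)
returncode = close_process(proc, timeout=30, terminate=True)
```

Note that waiting before draining a PIPE can block if the child writes a large amount of output; the sketch is only safe for children that write little or nothing.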
closed

Whether the Process has been closed.

communicate(inp: typing.Union[bytes, str] = None, timeout: float = None) → typing.Tuple[typing.IO, typing.IO]

Send input to stdin, wait for process to terminate, return results.

Parameters:
  • inp – Input to send to stdin.
  • timeout – Time to wait for process to finish.
Returns:

Tuple of (stdout, stderr).

flush() → None

Flushes stdin if there is one.

get_reader(which: str = None) → typing.IO

Returns the stream for reading data from stdout/stderr.

Parameters:which – Which stream to read from, ‘stdout’ or ‘stderr’. If None, stdout is used if it exists, otherwise stderr.
Returns:The specified stream, or None if the stream doesn’t exist.
get_readers()

Returns (stdout, stderr) tuple.

get_writer() → typing.IO

Returns the stream for writing to stdin.

is_wrapped(name: str) → bool

Returns True if the stream corresponding to name is wrapped.

Parameters:name – One of ‘stdin’, ‘stdout’, ‘stderr’
mode
name
read(size: int = -1, which: str = None) → bytes

Read size bytes/characters from stdout or stderr.

Parameters:
  • size – Number of bytes/characters to read.
  • which – Which stream to read from, ‘stdout’ or ‘stderr’. If None, stdout is used if it exists, otherwise stderr.
Returns:

The bytes/characters read from the specified stream.

readable() → bool

Returns True if this Popen has stdout and/or stderr, otherwise False.

wrap_pipes(**kwargs) → None

Wrap stdin/stdout/stderr PIPE streams using xopen.

Parameters:kwargs – for each of ‘stdin’, ‘stdout’, ‘stderr’, a dict providing arguments to xopen describing how the stream should be wrapped.
writable() → bool

Returns True if this Popen has stdin, otherwise False.

write(data: typing.Union[bytes, str]) → int

Write data to stdin.

Parameters:data – The data to write; must be bytes if stdin is a byte stream or string if stdin is a text stream.
Returns:Number of bytes/characters written
class xphyle.StdWrapper(stream: typing.IO, compression: typing.Union[bool, str] = False) → None

Bases: xphyle.FileLikeWrapper

Wrapper around stdin/stdout/stderr.

Parameters:
  • stream – The stream to wrap.
  • compression – Compression type.
closed
xphyle.configure(default_xopen_context_wrapper: bool = None, progress: bool = None, progress_wrapper: typing.Callable[..., typing.Iterable] = None, system_progress: bool = None, system_progress_wrapper: typing.Sequence[str] = None, threads: int = None, executable_path: typing.Sequence[str] = None) → None

Configure xphyle.

Parameters:
  • default_xopen_context_wrapper – Whether to wrap files opened by :method:`xopen` in FileLikeWrappers by default (when xopen’s context_wrapper parameter is None).
  • progress – Whether to wrap long-running operations with a progress bar
  • progress_wrapper – Specify a non-default progress wrapper
  • system_progress – Whether to use progress bars for system-level
  • system_progress_wrapper – Specify a non-default system progress wrapper
  • threads – The number of threads that can be used by compression formats that support parallel compression/decompression. Set to None or a number < 1 to automatically initialize to the number of cores on the local machine.
  • executable_path – List of paths where xphyle should look for system executables. These will be searched before the default system path.
xphyle.guess_file_format(path: str) → str

Try to guess the file format, first from the extension, and then from the header bytes.

Parameters:path – The path to the file
Returns:The file format, or None if one could not be determined
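The two-step guess (extension first, then header bytes) can be sketched with the standard library. The lookup tables and format labels below are assumptions chosen for illustration, and `guess_format` is a hypothetical name; xphyle's own format registry may differ.

```python
import gzip
import os
import tempfile

# Assumed lookup tables for this sketch only.
EXTENSIONS = {".gz": "gz", ".bz2": "bz2", ".xz": "xz"}
MAGIC = {b"\x1f\x8b": "gz", b"BZh": "bz2", b"\xfd7zXZ\x00": "xz"}


def guess_format(path):
    """Guess a compression format from the extension, then the header."""
    ext = os.path.splitext(path)[1].lower()
    if ext in EXTENSIONS:
        return EXTENSIONS[ext]
    try:
        with open(path, "rb") as handle:
            header = handle.read(6)
    except OSError:
        # Unreadable or missing file: nothing more to go on.
        return None
    for magic, name in MAGIC.items():
        if header.startswith(magic):
            return name
    return None


with tempfile.TemporaryDirectory() as tmp:
    # A gzip file with a misleading extension is still detected by header.
    hidden = os.path.join(tmp, "data.bin")
    with gzip.open(hidden, "wb") as out:
        out.write(b"payload")
    by_header = guess_format(hidden)
    by_extension = guess_format(os.path.join(tmp, "missing.bz2"))
    unknown = guess_format(os.path.join(tmp, "missing.txt"))
```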
xphyle.open_(path_or_file, mode: typing.Union[str, xphyle.types.FileMode] = None, errors: bool = True, wrap_fileobj: bool = True, **kwargs) → typing.Generator[[typing.IO, NoneType], NoneType]

Context manager that frees you from checking if an argument is a path or a file object. Calls xopen to open files.

Parameters:
  • path_or_file – A relative or absolute path, a URL, a system command, a file-like object, or bytes or str to indicate a writeable byte/string buffer.
  • mode – The file open mode.
  • errors – Whether to raise an error if there is a problem opening the file. If False, yields None when there is an error.
  • wrap_fileobj – If path_or_file is a file-like object, this parameter determines whether it will be passed to xopen for wrapping (True) or returned directly (False). If False, any kwargs are ignored.
  • kwargs – Additional args to pass through to xopen (if path_or_file is a path).
Yields:

A file-like object, or None if errors is False and there is a problem opening the file.

Examples

    with open_('myfile') as infile:
        print(next(infile))

    fileobj = open('myfile')
    with open_(fileobj) as infile:
        print(next(infile))
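The path-or-file idea behind this context manager can be sketched with `contextlib`. This is an illustration under stated assumptions, not the xphyle implementation: `open_path_or_file` is a hypothetical name, it uses the builtin `open` rather than xopen, and it leaves already-open file objects for the caller to close.

```python
import contextlib
import io


@contextlib.contextmanager
def open_path_or_file(path_or_file, mode="r", errors=True):
    """Yield a file object for either a path or an already-open file."""
    if hasattr(path_or_file, "read") or hasattr(path_or_file, "write"):
        # Already file-like: hand it back and leave closing to the caller.
        yield path_or_file
        return
    try:
        handle = open(path_or_file, mode)
    except OSError:
        if errors:
            raise
        # With errors=False, the caller receives None instead of an error.
        yield None
        return
    with handle:
        yield handle


with open_path_or_file(io.StringIO("first\nsecond\n")) as infile:
    first = next(infile)

with open_path_or_file("/nonexistent/dir/file.txt", errors=False) as infile:
    missing = infile
```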
xphyle.popen(args: typing.Iterable, stdin: typing.Union[str, pathlib.PurePath, typing.IO, int, dict, typing.Tuple[typing.Union[str, pathlib.PurePath, typing.IO, int], typing.Union[str, xphyle.types.FileMode, dict]]] = None, stdout: typing.Union[str, pathlib.PurePath, typing.IO, int, dict, typing.Tuple[typing.Union[str, pathlib.PurePath, typing.IO, int], typing.Union[str, xphyle.types.FileMode, dict]]] = None, stderr: typing.Union[str, pathlib.PurePath, typing.IO, int, dict, typing.Tuple[typing.Union[str, pathlib.PurePath, typing.IO, int], typing.Union[str, xphyle.types.FileMode, dict]]] = None, shell: bool = False, **kwargs) → xphyle.Process

Opens a subprocess, using xopen to open input/output streams.

Parameters:
  • args – argument string or tuple of arguments.
  • stdin, stdout, stderr – file to use as stdin, PIPE to open a pipe, a dict to pass xopen args for a PIPE, a tuple of (path, mode) or a tuple of (path, dict), where the dict contains parameters to pass to xopen.
  • shell – The ‘shell’ arg from subprocess.Popen.
  • kwargs – additional arguments to subprocess.Popen.
Returns:

A Process object, which is a subclass of subprocess.Popen.

xphyle.xopen(path, mode: typing.Union[str, xphyle.types.FileMode] = None, compression: typing.Union[bool, str] = None, use_system: bool = True, context_wrapper: bool = None, file_type: xphyle.types.FileType = None, validate: bool = True, **kwargs) → typing.IO

Replacement for the builtin open function that can also open URLs and subprocesses, and automatically handles compressed files.

Parameters:
  • path – A relative or absolute path, a URL, a system command, a file-like object, or bytes or str to indicate a writeable byte/string buffer.
  • mode – Some combination of the access mode (‘r’, ‘w’, ‘a’, or ‘x’) and the open mode (‘b’ or ‘t’). If the latter is not given, ‘t’ is used by default.
  • compression – If None or True, compression type (if any) will be determined automatically. If False, no attempt will be made to determine compression type. Otherwise this must specify the compression type (e.g. ‘gz’). See xphyle.compression for details. Note that compression will not be guessed for ‘-’ (stdin).
  • use_system – Whether to attempt to use system-level compression programs.
  • context_wrapper – If True, the file is wrapped in a FileLikeWrapper subclass before returning (FileWrapper for files/URLs, StdWrapper for STDIN/STDOUT/STDERR). If None, the default value (set using :method:`configure`) is used.
  • file_type – a FileType; explicitly specify the file type. By default the file type is detected, but auto-detection might make mistakes, e.g. when a local file name contains a colon (‘:’).
  • validate – Ensure that the user-specified compression format matches the format guessed from the file extension or magic bytes.
  • kwargs – Additional keyword arguments to pass to open.
path is interpreted as follows:
  • If starts with ‘|’, it is assumed to be a system command
  • If a file-like object, it is used as-is
  • If one of STDIN, STDOUT, STDERR, the appropriate sys stream is used
  • If parseable by xphyle.urls.parse_url(), it is assumed to be a URL
  • If file_type == FileType.BUFFER and path is a string or bytes and mode is readable, a new StringIO/BytesIO is created with ‘path’ passed to its constructor.
  • Otherwise it is assumed to be a local file

If use_system is True and the file is compressed, the file is opened with a pipe to the system-level compression program (e.g. gzip for ‘.gz’ files) if possible, otherwise the corresponding python library is used.
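The Python-library fallback for compressed local files, together with the text-mode default, can be sketched as follows. This covers only local paths and three formats; `open_compressed` and the `OPENERS` table are hypothetical names for this illustration, and system-level programs, URLs, buffers, and processes are deliberately left out.

```python
import bz2
import gzip
import lzma
import os
import tempfile

# Assumed mapping from compression name to a stdlib opener.
OPENERS = {"gz": gzip.open, "bz2": bz2.open, "xz": lzma.open}


def open_compressed(path, mode="r", compression=None):
    """Open a local file, transparently handling a few compressed formats."""
    # Default to text mode when neither 'b' nor 't' is given.
    if "b" not in mode and "t" not in mode:
        mode += "t"
    if compression is None:
        # Guess from the extension; fall back to an uncompressed open.
        ext = os.path.splitext(path)[1].lstrip(".")
        compression = ext if ext in OPENERS else False
    if compression is False:
        return open(path, mode)
    if compression not in OPENERS:
        raise ValueError(f"unsupported compression format: {compression!r}")
    return OPENERS[compression](path, mode)


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "notes.txt.gz")
    with open_compressed(target, "w") as out:
        out.write("hello\n")
    with open_compressed(target) as infile:
        text = infile.read()
    # The file on disk really is gzip-compressed.
    with open(target, "rb") as raw:
        magic = raw.read(2)
```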

Returns:

A Process if file_type is PROCESS, or if file_type is None and path starts with ‘|’. Otherwise, an opened file-like object. If context_wrapper is True, this will be a subclass of FileLikeWrapper.

Raises:

ValueError if:
  • compression is True and compression format cannot be determined
  • the specified compression format is invalid
  • validate is True and the specified compression format is not the actual format of the file
  • the path or mode are invalid

xphyle.utils module

A collection of convenience methods for reading, writing, and otherwise managing files. All of these functions are ‘safe’, meaning that if you pass errors=False and there is a problem opening the file, the error will be handled gracefully.

class xphyle.utils.CompressOnClose(**kwargs: typing.Dict[typing.Any, typing.Any]) → None

Bases: xphyle.EventListener

Compress a file after it is closed.

compressed_path = None
execute(wrapper: xphyle.FileWrapper, **kwargs) → None
class xphyle.utils.CycleFileOutput(files: typing.Iterable[typing.Union[str, pathlib.PurePath, typing.Tuple[typing.Any, typing.Union[str, pathlib.PurePath, typing.IO]]]] = None, char_mode: CharMode = None, **kwargs) → None

Bases: xphyle.utils.FileOutput

Alternate each line between files.

Parameters:
  • files – A list of files.
  • char_mode – The character mode.
class xphyle.utils.FileInput(files: typing.Iterable[typing.Union[str, pathlib.PurePath, typing.Tuple[typing.Any, typing.Union[str, pathlib.PurePath, typing.IO]]]] = None, char_mode: CharMode = None) → None

Bases: xphyle.utils.FileManager, typing.Iterator

Similar to Python’s :module:`fileinput`, but uses xopen to open files. Currently only supports sequential line-oriented access via next or readline.

Parameters:
  • files – List of files.
  • char_mode – The character mode.

Notes

Default values are not allowed for generically typed parameters. In a future version, char_mode will default to None and it will be required to specify the mode, or use one of the convenience methods (:method:`textinput` or :method:`byteinput`).

add(path_or_file: typing.Union[str, pathlib.PurePath, typing.IO], key: typing.Union[int, str] = None, **kwargs) → None

Overrides FileManager.add() to prevent file-specific open args.

filekey

The key of the file currently being read.

filename

The name of the file currently being read.

finished

Whether all data has been read from all files.

lineno

The total number of lines that have been read so far from all files.

readline() → CharMode

Read the next line from the current file (advancing to the next file if necessary and possible).

Returns:The next line, or the empty string if self.finished==True.
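Sequential reading across several files, with a running line count and a `finished` flag, can be sketched as below. `SequentialLines` is a hypothetical class operating on already-open text files; it illustrates the documented behavior and is not the FileInput implementation.

```python
import io


class SequentialLines:
    """Hypothetical reader that walks several open files line by line."""

    def __init__(self, files):
        self._pending = list(files)
        # Total number of lines read so far, across all files.
        self.lineno = 0

    @property
    def finished(self):
        return not self._pending

    def readline(self):
        while self._pending:
            line = self._pending[0].readline()
            if line:
                self.lineno += 1
                return line
            # Current file is exhausted; advance to the next one.
            self._pending.pop(0)
        # All files are exhausted: return the empty string.
        return ""


reader = SequentialLines(
    [io.StringIO("a\nb\n"), io.StringIO(""), io.StringIO("c\n")]
)
lines = []
while True:
    line = reader.readline()
    if not line:
        break
    lines.append(line)
```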
class xphyle.utils.FileManager(files: typing.Iterable[typing.Union[str, pathlib.PurePath, typing.Tuple[typing.Any, typing.Union[str, pathlib.PurePath, typing.IO]]]] = None, header=None, **kwargs) → None

Bases: collections.abc.Sized

Dict-like container for files. Files are opened lazily (upon first request) using xopen.

Parameters:
  • files – An iterable of files to add. Each item can either be a string path or a (key, fileobj) tuple.
  • header – A header to write when opening writable files.
  • kwargs – Default arguments to pass to xopen.
add(path_or_file: typing.Union[str, pathlib.PurePath, typing.IO], key: typing.Union[int, str] = None, **kwargs) → None

Add a file.

Parameters:
  • path_or_file – Path or file object. If this is a path, the file will be opened with the specified mode.
  • key – Dict key. Defaults to the file name.
  • kwargs – Arguments to pass to xopen. These override any keyword arguments passed to the FileManager’s constructor.
add_all(files: typing.Union[typing.Iterable[typing.Union[str, pathlib.PurePath, typing.Tuple[typing.Any, typing.Union[str, pathlib.PurePath, typing.IO]]]], typing.Dict[typing.Any, typing.Union[str, pathlib.PurePath, typing.IO]]], **kwargs) → None

Add all files from an iterable or dict.

Parameters:
  • files – An iterable or dict of files to add. If an iterable, each item can either be a string path or a (key, fileobj) tuple.
  • kwargs – Additional arguments to pass to add.
close() → None

Close all files being tracked.

get(key: typing.Union[int, str]) → typing.IO

Get the file object associated with a path. If the file is not already open, it is first opened with xopen.

Parameters:key – The file name/key.
Returns:The opened file.
get_path(key: typing.Union[int, str]) → typing.Union[str, pathlib.PurePath]

Returns the file path associated with a key.

Parameters:key – The key to resolve.
Returns:The file path.
iter_files() → typing.Generator[[typing.Tuple[typing.Any, typing.IO], NoneType], NoneType]

Iterates over all (key, file) pairs in the order they were added.

keys

Returns a list of all keys in the order they were added.

paths

Returns a list of all paths in the order they were added.

class xphyle.utils.FileOutput(files: typing.Iterable[typing.Union[str, pathlib.PurePath, typing.Tuple[typing.Any, typing.Union[str, pathlib.PurePath, typing.IO]]]] = None, access: typing.Union[str, xphyle.types.ModeAccess] = 'w', char_mode: CharMode = None, linesep: CharMode = None, encoding: str = 'utf-8', header: CharMode = None) → None

Bases: xphyle.utils.FileManager, typing.Generic

Base class for file manager that writes to multiple files.

Parameters:
  • files – The list of files to open.
  • char_mode – The CharMode.
  • access – How to open the output files (‘w’, ‘a’, ‘x’).
  • linesep – The line separator (type must match char_mode).
  • encoding – Default character encoding to use.
  • header – Default file header to write when opening output files.

Notes

Default values for generically typed parameters are not allowed. In a future version, char_mode and linesep will default to None and must be explicitly defined.

write(data: typing.Any, detect_newlines: bool = True) → int

Writes data to the output.

Parameters:
  • data – The data to write; will be converted to string/bytes.
  • detect_newlines – If True, data is split on linesep and the resulting lines are written using :method:`writelines`, otherwise data is written using :method:`writeline`.
Returns:

The number of characters written.

writeline(line: typing.Union[bytes, str] = None) → typing.Tuple[int, int]

Write a line to the output(s).

Parameters:line – The line to write.
Returns:The tuple (lines_written, chars_written).
writelines(lines: typing.Iterable[typing.Union[bytes, str]]) → typing.Tuple[int, int]

Write an iterable of lines to the output(s).

Parameters:lines – An iterable of lines to write.
Returns:The tuple (lines_written, chars_written).
class xphyle.utils.MoveOnClose(**kwargs: typing.Dict[typing.Any, typing.Any]) → None

Bases: xphyle.EventListener

Move a file after it is closed.

execute(wrapper: xphyle.FileWrapper, dest: typing.Union[str, pathlib.PurePath] = None, **kwargs) → None
class xphyle.utils.NCycleFileOutput(files: typing.Iterable[typing.Union[str, pathlib.PurePath, typing.Tuple[typing.Any, typing.Union[str, pathlib.PurePath, typing.IO]]]] = None, char_mode: CharMode = None, lines_per_file: int = 1, **kwargs) → None

Bases: xphyle.utils.FileOutput

Alternate output lines between files.

Parameters:
  • files – A list of files.
  • char_mode – The character mode.
  • lines_per_file – How many lines to write to a file before moving on to the next file.
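The cycling behavior can be sketched with `itertools.cycle`. `write_cycling` is a hypothetical function that works on already-open text outputs and appends a newline to each line; it illustrates the idea and is not xphyle's class.

```python
import io
import itertools


def write_cycling(lines, outputs, lines_per_file=1):
    """Write lines to outputs in turn, lines_per_file lines at a time."""
    targets = itertools.cycle(outputs)
    current = next(targets)
    written = 0
    for line in lines:
        if written == lines_per_file:
            # This output has had its share; move on to the next one.
            current = next(targets)
            written = 0
        current.write(line + "\n")
        written += 1


first, second = io.StringIO(), io.StringIO()
write_cycling(["1", "2", "3", "4", "5"], [first, second], lines_per_file=2)
```

With two outputs and two lines per file, lines 1, 2, and 5 land in the first output and lines 3 and 4 in the second.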
class xphyle.utils.PatternFileOutput(filename_pattern: str = None, char_mode: CharMode = None, token_func: typing.Callable[[typing.Union[bytes, str]], typing.Dict[typing.Union[bytes, str], typing.Any]] = <function PatternFileOutput.<lambda>>, **kwargs) → None

Bases: xphyle.utils.TokenFileOutput

Use a callable to generate filenames based on data in lines.

Parameters:
  • filename_pattern – The pattern of file names to create. Should have a single token (‘{}’ or ‘{0}’) that is replaced with the file index.
  • char_mode – The character mode.
  • token_func – Function to extract token(s) from lines in file. By default this is the identity function, which is almost never what you want.
  • kwargs – Additional args.
class xphyle.utils.RemoveOnClose(**kwargs: typing.Dict[typing.Any, typing.Any]) → None

Bases: xphyle.EventListener

Remove a file after it is closed.

execute(wrapper: xphyle.FileWrapper, **kwargs) → None
class xphyle.utils.RollingFileOutput(filename_pattern: typing.Iterable[str] = None, char_mode: CharMode = None, lines_per_file: int = 1, **kwargs) → None

Bases: xphyle.utils.TokenFileOutput

Write up to lines_per_file lines to a file before opening the next file. File names are created from a pattern.

Parameters:
  • filename_pattern – The pattern of file names to create. Should have a single token (‘{}’ or ‘{0}’) that is replaced with the file index.
  • char_mode – The character mode.
  • lines_per_file – The max number of lines to write to each file.
  • kwargs – Additional args.
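Rolling output driven by a filename pattern can be sketched with `str.format`. `write_rolling` is a hypothetical function; starting the file index at 0 and appending a newline to each line are assumptions of this sketch.

```python
import os
import tempfile


def write_rolling(lines, filename_pattern, lines_per_file=1):
    """Write lines to numbered files, rolling over every lines_per_file."""
    paths = []
    handle = None
    try:
        for count, line in enumerate(lines):
            if count % lines_per_file == 0:
                if handle is not None:
                    handle.close()
                # The '{}' (or '{0}') token is replaced with the file index.
                path = filename_pattern.format(len(paths))
                handle = open(path, "w")
                paths.append(path)
            handle.write(line + "\n")
    finally:
        if handle is not None:
            handle.close()
    return paths


with tempfile.TemporaryDirectory() as tmp:
    pattern = os.path.join(tmp, "part{}.txt")
    created = write_rolling(["a", "b", "c"], pattern, lines_per_file=2)
    names = [os.path.basename(path) for path in created]
    with open(created[1]) as infile:
        last = infile.read()
```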
class xphyle.utils.TeeFileOutput(files: typing.Iterable[typing.Union[str, pathlib.PurePath, typing.Tuple[typing.Any, typing.Union[str, pathlib.PurePath, typing.IO]]]] = None, access: typing.Union[str, xphyle.types.ModeAccess] = 'w', char_mode: CharMode = None, linesep: CharMode = None, encoding: str = 'utf-8', header: CharMode = None) → None

Bases: xphyle.utils.FileOutput

Write output to multiple files simultaneously.

class xphyle.utils.TokenFileOutput(filename_pattern: str = None, char_mode: CharMode = None, **kwargs) → None

Bases: xphyle.utils.FileOutput

Generate file names according to a pattern.

Parameters:
  • filename_pattern – The pattern of file names to create. Should have a single token (‘{}’ or ‘{0}’) that is replaced with the file index.
  • char_mode – The character mode.
  • kwargs – Additional args.
xphyle.utils.byteinput(files: typing.Iterable[typing.Union[str, pathlib.PurePath, typing.Tuple[typing.Any, typing.Union[str, pathlib.PurePath, typing.IO]]]] = None)

Convenience method that creates a new FileInput in bytes mode.

Parameters:files – The files to open. If None, files passed on the command line are used, or STDIN if there are no command line arguments.
Returns:A FileInput[bytes] instance.
xphyle.utils.byteoutput(files: typing.Iterable[typing.Union[str, pathlib.PurePath, typing.Tuple[typing.Any, typing.Union[str, pathlib.PurePath, typing.IO]]]] = None, file_output_type: typing.Callable[..., xphyle.utils.FileOutput[bytes]] = xphyle.utils.TeeFileOutput<~CharMode>[bytes], **kwargs) → xphyle.utils.FileOutput[bytes]

Convenience function to create a fileoutput in bytes mode.

Parameters:
  • files – The files to write to.
  • file_output_type – The specific subclass of FileOutput to create.
  • kwargs – additional arguments to pass to the FileOutput constructor.
Returns:

A FileOutput instance.

xphyle.utils.compress_file(source_file: typing.Union[str, pathlib.PurePath, typing.IO], compressed_file: typing.Union[str, pathlib.PurePath, typing.IO] = None, compression: typing.Union[bool, str] = None, keep: bool = True, compresslevel: int = None, use_system: bool = True, **kwargs) → typing.Union[str, pathlib.PurePath]

Compress an existing file, either in-place or to a separate file.

Parameters:
  • source_file – Path or file-like object to compress.
  • compressed_file – The compressed path or file-like object. If None, compression is performed in-place. If True, file name is determined from source_file and the decompressed file is retained.
  • compression – If True, guess compression format from the file name, otherwise the name of any supported compression format.
  • keep – Whether to keep the source file.
  • compresslevel – Compression level.
  • use_system – Whether to try to use system-level compression.
  • kwargs – Additional arguments to pass to the open method when opening the compressed file.
Returns:

The path to the compressed file.
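For the gzip case, compressing an existing file while optionally removing the source can be sketched with the standard library. `gzip_file` is a hypothetical helper; deriving the output name by appending ".gz" when no destination is given is an assumption of this sketch and may not match what xphyle does.

```python
import gzip
import os
import shutil
import tempfile


def gzip_file(source_file, compressed_file=None, keep=True, compresslevel=9):
    """Compress source_file with gzip and return the compressed path."""
    if compressed_file is None:
        # Assumed naming rule for this sketch.
        compressed_file = source_file + ".gz"
    with open(source_file, "rb") as src:
        with gzip.open(compressed_file, "wb", compresslevel=compresslevel) as dst:
            # Stream the data in chunks instead of loading it all at once.
            shutil.copyfileobj(src, dst)
    if not keep:
        os.remove(source_file)
    return compressed_file


with tempfile.TemporaryDirectory() as tmp:
    source = os.path.join(tmp, "report.txt")
    with open(source, "wb") as out:
        out.write(b"line one\n")
    result = gzip_file(source, keep=False)
    source_remains = os.path.exists(source)
    with gzip.open(result, "rb") as infile:
        roundtrip = infile.read()
```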

xphyle.utils.decompress_file(compressed_file: typing.Union[str, pathlib.PurePath, typing.IO], dest_file: typing.Union[str, pathlib.PurePath, typing.IO] = None, compression: typing.Union[bool, str] = None, keep: bool = True, use_system: bool = True, **kwargs) → typing.Union[str, pathlib.PurePath]

Decompress an existing file, either in-place or to a separate file.

Parameters:
  • compressed_file – Path or file-like object to decompress.
  • dest_file – Path or file-like object for the decompressed file. If None, file will be decompressed in-place. If True, file will be decompressed to a new file (and the compressed file retained) whose name is determined automatically.
  • compression – None or True, to guess compression format from the file name, or the name of any supported compression format.
  • keep – Whether to keep the source file.
  • use_system – Whether to try to use system-level compression
  • kwargs – Additional arguments to pass to the open method when opening the compressed file.
Returns:

The path of the decompressed file.

xphyle.utils.exec_process(*args, inp: typing.Union[bytes, str] = None, timeout: int = None, **kwargs) → xphyle.Process

Shortcut to execute a process, wait for it to terminate, and return the results.

Parameters:
  • args – Positional arguments to popen.
  • inp – String/bytes to write to process input stream.
  • timeout – Time to wait for process to complete.
  • kwargs – Keyword arguments to popen.
Returns:

A terminated Process. The contents of stdout and stderr are recorded in the stdout and stderr attributes.

xphyle.utils.fileinput(files: typing.Iterable[typing.Union[str, pathlib.PurePath, typing.Tuple[typing.Any, typing.Union[str, pathlib.PurePath, typing.IO]]]] = None, char_mode: CharMode = None) → xphyle.utils.FileInput[CharMode]

Convenience method that creates a new FileInput.

Parameters:
  • files – The files to open. If None, files passed on the command line are used, or STDIN if there are no command line arguments.
  • char_mode – The default read mode (‘t’ for text or b’b’ for binary).
Returns:

A FileInput instance.

Notes

Default values are not allowed for generically typed parameters. Use :method:`textinput` or :method:`byteinput` instead.

xphyle.utils.fileoutput(files: typing.Iterable[typing.Union[str, pathlib.PurePath, typing.Tuple[typing.Any, typing.Union[str, pathlib.PurePath, typing.IO]]]] = None, char_mode: CharMode = None, linesep: CharMode = None, encoding: str = 'utf-8', file_output_type: typing.Callable[..., xphyle.utils.FileOutput[CharMode]] = xphyle.utils.TeeFileOutput<~CharMode>[~CharMode]<~CharMode>, **kwargs) → xphyle.utils.FileOutput[CharMode]

Convenience function to create a fileoutput.

Parameters:
  • files – The files to write to.
  • char_mode – The write mode (‘t’ or b’b’).
  • linesep – The separator to use when writing lines.
  • encoding – The default file encoding to use.
  • file_output_type – The specific subclass of FileOutput to create.
  • kwargs – additional arguments to pass to the FileOutput constructor.
Returns:

A FileOutput instance.

Notes

Default values are not allowed for generically typed parameters. Use :method:`textoutput` or :method:`byteoutput` instead.

xphyle.utils.linecount(path_or_file: typing.Union[str, pathlib.PurePath, typing.IO], linesep: bytes = None, buffer_size: int = 1048576, **kwargs) → int

Fastest pythonic way to count the lines in a file.

Parameters:
  • path_or_file – File object, or path to the file.
  • linesep – Line delimiter, specified as a byte string (e.g. b’\n’).
  • buffer_size – How many bytes to read at a time (1 MiB by default).
  • kwargs – Additional arguments to pass to the file open method.
Returns:

The number of lines in the file. Blank lines (including the last line in the file) are included.
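Buffered counting can be sketched as follows. `count_separators` is a hypothetical function that counts occurrences of the separator in a binary file object; how a final line without a trailing separator is treated here (it is not counted) is a property of this sketch and may differ from linecount.

```python
import io


def count_separators(fileobj, linesep=b"\n", buffer_size=1024 * 1024):
    """Count occurrences of linesep, reading buffer_size bytes at a time."""
    count = 0
    # Carry over the last len(linesep) - 1 bytes so that a multi-byte
    # separator split across two chunks is still found.
    keep = len(linesep) - 1
    tail = b""
    while True:
        chunk = fileobj.read(buffer_size)
        if not chunk:
            break
        data = tail + chunk
        count += data.count(linesep)
        tail = data[-keep:] if keep else b""
    return count


unix_count = count_separators(io.BytesIO(b"x\ny\n"))
# A tiny buffer forces the two-byte separator to straddle chunk boundaries.
dos_count = count_separators(io.BytesIO(b"a\r\nb\r\nc"), b"\r\n", buffer_size=2)
```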

xphyle.utils.read_bytes(path_or_file: typing.Union[str, pathlib.PurePath, typing.IO], chunksize: int = 1024, **kwargs) → typing.Generator[[bytes, NoneType], NoneType]

Iterate over a file in chunks. The mode will always be overridden to ‘rb’.

Parameters:
  • path_or_file – Path to the file, or a file-like object.
  • chunksize – Number of bytes to read at a time.
  • kwargs – Additional arguments to pass to :method:`xphyle.open_`.
Yields:

Chunks of the input file as bytes. Each chunk except the last should be of size chunksize.
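Chunked iteration over a binary file object takes only a few lines. `iter_chunks` is a hypothetical name for this sketch, which works on an already-open file rather than a path.

```python
import io


def iter_chunks(fileobj, chunksize=1024):
    """Yield successive chunks of at most chunksize bytes."""
    while True:
        chunk = fileobj.read(chunksize)
        if not chunk:
            # An empty read means end of file.
            return
        yield chunk


chunks = list(iter_chunks(io.BytesIO(b"abcdefghij"), chunksize=4))
```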

xphyle.utils.read_delimited(path: typing.Union[str, pathlib.PurePath], sep: str = '\t', header: typing.Union[bool, typing.Sequence[str]] = False, converters: typing.Union[typing.Callable[[str], typing.Any], typing.Iterable[typing.Callable[[str], typing.Any]]] = None, yield_header: bool = True, row_type: typing.Union[str, typing.Callable[[typing.Sequence[str]], typing.Any]] = 'list', **kwargs) → typing.Generator[[typing.Any, NoneType], NoneType]

Iterate over rows in a delimited file.

Parameters:
  • path – Path to the file, or a file-like object.
  • sep – The field delimiter.
  • header – Either True or False to specify whether the file has a header, or a sequence of column names.
  • converters – callable, or iterable of callables, to call on each value.
  • yield_header – If header == True, whether the first row yielded should be the header row.
  • row_type – The collection type to return for each row: tuple, list, or dict.
  • kwargs – additional arguments to pass to csv.reader.
Yields:

Rows of the delimited file. If header==True, the first row yielded is the header row, and its type is always a list. Converters are not applied to the header row.
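The header and converter handling can be sketched on top of `csv.reader`. `iter_delimited` is a hypothetical function that covers only the list row type and a boolean header; it illustrates the documented behavior and is not xphyle's code.

```python
import csv
import io


def iter_delimited(fileobj, sep="\t", header=False, converters=None,
                   yield_header=True):
    """Yield rows of a delimited file, converting values if requested."""
    rows = csv.reader(fileobj, delimiter=sep)
    if header is True:
        header_row = next(rows, None)
        if header_row is None:
            return
        if yield_header:
            # Converters are not applied to the header row.
            yield header_row
    for row in rows:
        if converters is None:
            yield row
        elif callable(converters):
            # One callable applied to every value.
            yield [converters(value) for value in row]
        else:
            # One callable per column.
            yield [convert(value) for convert, value in zip(converters, row)]


table = io.StringIO("name\tcount\nalpha\t1\nbeta\t2\n")
rows = list(iter_delimited(table, header=True, converters=[str, int]))
```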

xphyle.utils.read_delimited_as_dict(path: typing.Union[str, pathlib.PurePath], sep: str = '\t', header: typing.Union[bool, typing.Sequence[str]] = False, key: typing.Union[int, typing.Callable[[typing.Sequence[str]], typing.Any]] = 0, **kwargs) → typing.Dict[typing.Any, typing.Any]

Parse rows in a delimited file and add rows to a dict based on a specified key index or function.

Parameters:
  • path – Path to the file, or a file-like object.
  • sep – Field delimiter.
  • header – If True, read the header from the first line of the file, otherwise a list of column names.
  • key – The column to use as a dict key, or a function to extract the key from the row. If a string value, header must be specified. All values must be unique, or an exception is raised.
  • kwargs – Additional arguments to pass to read_delimited.
Returns:

A dict with as many elements as rows in the file.

Raises:

Exception if a duplicate key is generated.
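A rough sketch of the keyed-dict idea, using only the standard library. The function name rows_to_dict_sketch is hypothetical, and raising ValueError on a duplicate key is an assumption; the documentation above only says that an exception is raised.

```python
import csv
import io
from typing import Any, Callable, Dict, Iterable, List, Union


def rows_to_dict_sketch(
    lines: Iterable[str],
    sep: str = "\t",
    key: Union[int, Callable[[List[str]], Any]] = 0,
) -> Dict[Any, List[str]]:
    """Hypothetical stand-in: index rows by a column number or key function."""
    result: Dict[Any, List[str]] = {}
    for row in csv.reader(lines, delimiter=sep):
        row_key = key(row) if callable(key) else row[key]
        if row_key in result:
            # Assumption: the exact exception type is not documented.
            raise ValueError(f"Duplicate key: {row_key!r}")
        result[row_key] = row
    return result


table = rows_to_dict_sketch(io.StringIO("a\t1\nb\t2\n"))
by_value = rows_to_dict_sketch(io.StringIO("a\t1\nb\t2\n"), key=lambda row: int(row[1]))
```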

xphyle.utils.read_dict(path_or_file: typing.Union[str, pathlib.PurePath, typing.IO], sep: str = '=', convert: typing.Callable[[str], typing.Any] = None, ordered: bool = False, **kwargs) → typing.Dict[str, typing.Any]

Read lines from simple property file (key=value). Comment lines (starting with ‘#’) are ignored.

Parameters:
  • path_or_file – Property file, or a list of properties.
  • sep – Key-value delimiter (defaults to ‘=’).
  • convert – Function to call on each value.
  • ordered – Whether to return an OrderedDict.
  • kwargs – Additional arguments to pass to xphyle.open_.
Returns:

An OrderedDict, if ‘ordered’ is True, otherwise a dict.
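The property-file format described above is simple enough to sketch with the standard library. The function below is a hypothetical stand-in that takes an iterable of lines; skipping blank lines is an assumption that the documentation does not state.

```python
from collections import OrderedDict
from typing import Any, Callable, Dict, Iterable, Optional


def read_dict_sketch(
    lines: Iterable[str],
    sep: str = "=",
    convert: Optional[Callable[[str], Any]] = None,
    ordered: bool = False,
) -> Dict[str, Any]:
    """Hypothetical stand-in: parse key=value lines, ignoring '#' comments."""
    result: Dict[str, Any] = OrderedDict() if ordered else {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # comment line (blank-line skipping is an assumption)
        key, value = line.split(sep, 1)
        result[key] = convert(value) if convert else value
    return result


props = read_dict_sketch(["# settings", "threads=4", "level=6"], convert=int)
```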

xphyle.utils.read_lines(path_or_file: typing.Union[str, pathlib.PurePath, typing.IO], convert: typing.Callable[[str], typing.Any] = None, strip_linesep: bool = True, **kwargs) → typing.Generator[[str, NoneType], NoneType]

Iterate over lines in a file.

Parameters:
  • path_or_file – Path to the file, or a file-like object.
  • convert – Function to call on each line in the file.
  • strip_linesep – Whether to strip off trailing line separators.
  • kwargs – Additional arguments to pass to xphyle.open_.
Yields:

Lines of the file, with line endings stripped if strip_linesep is True.

xphyle.utils.textinput(files: typing.Iterable[typing.Union[str, pathlib.PurePath, typing.Tuple[typing.Any, typing.Union[str, pathlib.PurePath, typing.IO]]]] = None)

Convenience method that creates a new FileInput in text mode.

Parameters:files – The files to open. If None, files passed on the command line are used, or STDIN if there are no command line arguments.
Returns:A FileInput[Text] instance.
xphyle.utils.textoutput(files: typing.Iterable[typing.Union[str, pathlib.PurePath, typing.Tuple[typing.Any, typing.Union[str, pathlib.PurePath, typing.IO]]]] = None, file_output_type: typing.Callable[..., xphyle.utils.FileOutput[str]] = xphyle.utils.TeeFileOutput<~CharMode>[str], **kwargs) → xphyle.utils.FileOutput[str]

Convenience function to create a FileOutput in text mode.

Parameters:
  • files – The files to write to.
  • file_output_type – The specific subclass of FileOutput to create.
  • kwargs – additional arguments to pass to the FileOutput constructor.
Returns:

A FileOutput instance.

xphyle.utils.to_bytes(value: typing.Any, encoding: str = 'utf-8')

Convert an arbitrary value to bytes.

Parameters:
  • value – Some value.
  • encoding – The byte encoding to use.
Returns:

value converted to a string and then encoded as bytes.

xphyle.utils.transcode_file(source_file: typing.Union[str, pathlib.PurePath, typing.IO], dest_file: typing.Union[str, pathlib.PurePath, typing.IO], source_compression: typing.Union[bool, str] = True, dest_compression: typing.Union[bool, str] = True, use_system: bool = True, source_open_args: dict = None, dest_open_args: dict = None) → None

Convert from one file format to another.

Parameters:
  • source_file – The path or file-like object to read from. If a file, it must be opened in mode ‘rb’.
  • dest_file – The path or file-like object to write to. If a file, it must be opened in binary mode.
  • source_compression – The compression type of the source file. If True, guess compression format from the file name, otherwise the name of any supported compression format.
  • dest_compression – The compression type of the dest file. If True, guess compression format from the file name, otherwise the name of any supported compression format.
  • source_open_args – Additional arguments to pass to xopen for the source file.
  • dest_open_args – Additional arguments to pass to xopen for the destination file.
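For orientation, converting between two compression formats can be sketched with the standard library. This is not how xphyle implements transcode_file: the helper name gzip_to_bz2_sketch is hypothetical, and the two formats are hard-coded rather than guessed from the file names.

```python
import bz2
import gzip
import os
import shutil
import tempfile


def gzip_to_bz2_sketch(source_path: str, dest_path: str) -> None:
    """Hypothetical stand-in: stream a gzip file into a bzip2 file."""
    with gzip.open(source_path, "rb") as src, bz2.open(dest_path, "wb") as dst:
        # Copy in chunks so the whole file is never held in memory.
        shutil.copyfileobj(src, dst)


with tempfile.TemporaryDirectory() as tmp:
    src_path = os.path.join(tmp, "data.txt.gz")
    dst_path = os.path.join(tmp, "data.txt.bz2")
    with gzip.open(src_path, "wb") as out:
        out.write(b"hello\n")
    gzip_to_bz2_sketch(src_path, dst_path)
    with bz2.open(dst_path, "rb") as inp:
        roundtrip = inp.read()
```

Both files are opened in binary mode, matching the requirement above that file objects passed in are opened in binary mode.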
xphyle.utils.write_bytes(iterable: typing.Iterable[bytes], path_or_file: typing.Union[str, pathlib.PurePath, typing.IO], sep: bytes = b'', convert: typing.Callable[[typing.Any], bytes] = <function to_bytes>, **kwargs) → int

Write an iterable of bytes to a file.

Parameters:
  • iterable – An iterable.
  • path_or_file – Path to the file, or a file-like object.
  • sep – Separator between items.
  • convert – Function that converts a value to bytes.
  • kwargs – Additional arguments to pass to xphyle.open_.
Returns:

Total number of bytes written, or -1 if errors=False and there was a problem opening the file.

xphyle.utils.write_dict(dictobj: typing.Dict[str, typing.Any], path: typing.Union[str, pathlib.PurePath], sep: str = '=', linesep: str = '\n', convert: typing.Callable[[typing.Any], str] = <class 'str'>, **kwargs) → int

Write a dict to a file as name=value lines.

Parameters:
  • dictobj – The dict (or dict-like object).
  • path – Path to the file.
  • sep – The delimiter between key and value (defaults to ‘=’).
  • linesep – The delimiter between values, or os.linesep if None (defaults to ‘\n’).
  • convert – Function that converts a value to a string.
Returns:

Total number of bytes written, or -1 if errors=False and there was a problem opening the file.

xphyle.utils.write_lines(iterable: typing.Iterable[str], path_or_file: typing.Union[str, pathlib.PurePath, typing.IO], linesep: str = '\n', convert: typing.Callable[[typing.Any], str] = <class 'str'>, **kwargs) → int

Write delimiter-separated strings to a file.

Parameters:
  • iterable – An iterable.
  • path_or_file – Path to the file, or a file-like object.
  • linesep – The delimiter to use to separate the strings, or os.linesep if None (defaults to ‘\n’).
  • convert – Function that converts a value to a string.
  • kwargs – Additional arguments to pass to xphyle.open_.
Returns:

Total number of bytes written, or -1 if errors=False and there was a problem opening the file.

xphyle.paths module

Convenience functions for working with file paths.

class xphyle.paths.DirSpec(*path_vars: xphyle.paths.PathVar, template: str = None, pattern: typing.Union[str, Pattern[~AnyStr]] = None) → None

Bases: xphyle.paths.SpecBase

Spec for the directory part of a path.

default_pattern
default_search_root() → typing.Union[str, pathlib.PurePath]
default_var_name
path_part(path) → str
path_type
xphyle.paths.EXECUTABLE_CACHE = <xphyle.paths.ExecutableCache object>

Singleton instance of ExecutableCache.

class xphyle.paths.ExecutableCache(default_path: typing.Iterable[typing.Union[str, pathlib.PurePath]] = ['/home/docs/checkouts/readthedocs.org/user_builds/xphyle/envs/latest/bin', '/usr/local/sbin', '/usr/local/bin', '/usr/sbin', '/usr/bin', '/sbin', '/bin', '/home/docs/miniconda2/bin']) → None

Bases: object

Lookup and cache executable paths.

Parameters:default_path – The default executable path
add_search_path(paths: typing.Union[pathlib.PurePath, typing.Iterable[typing.Union[str, pathlib.PurePath]]]) → None

Add directories to the beginning of the executable search path.

Parameters:paths – List of paths, or a string with directories separated by os.pathsep.
get_path(executable: str) → typing.Union[str, pathlib.PurePath]

Get the full path of executable.

Parameters:executable – An executable name.
Returns:The full path of executable, or None if the path cannot be found.
reset_search_path(default_path: typing.Iterable[typing.Union[str, pathlib.PurePath]] = ['/home/docs/checkouts/readthedocs.org/user_builds/xphyle/envs/latest/bin', '/usr/local/sbin', '/usr/local/bin', '/usr/sbin', '/usr/bin', '/sbin', '/bin', '/home/docs/miniconda2/bin']) → None

Reset the search path to default_path.

Parameters:default_path – The default executable path.
resolve_exe(names: typing.Iterable[str]) → typing.Tuple

Given an iterable of command names, find the first that resolves to an executable.

Parameters:names – An iterable of command names.
Returns:A tuple (path, name) of the first command to resolve, or None if none of the commands resolve.
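The "first name that resolves" lookup can be sketched with shutil.which. The function name resolve_exe_sketch is hypothetical; unlike ExecutableCache, it does no caching and searches only the PATH of the current process.

```python
import shutil
from typing import Iterable, Optional, Tuple


def resolve_exe_sketch(names: Iterable[str]) -> Optional[Tuple[str, str]]:
    """Hypothetical stand-in: return (path, name) for the first resolvable command."""
    for name in names:
        path = shutil.which(name)
        if path is not None:
            return (path, name)
    return None


# Prefer pigz, fall back to gzip; the result is None if neither is installed.
resolved = resolve_exe_sketch(["pigz", "gzip"])
```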
class xphyle.paths.FileSpec(*path_vars: xphyle.paths.PathVar, template: str = None, pattern: typing.Union[str, Pattern[~AnyStr]] = None) → None

Bases: xphyle.paths.SpecBase

Spec for the filename part of a path.

Examples

spec = FileSpec(
    PathVar('id', pattern='[A-Z0-9_]+'),
    PathVar('ext', pattern='[^.]+'),
    template='{id}.{ext}')

# get a single file
path = spec(id='ABC123', ext='txt') # => PathInst('ABC123.txt')
print(path['id']) # => 'ABC123'

# get the variable values for a path
path = spec.parse('ABC123.txt')
print(path['id']) # => 'ABC123'

# find all files that match a FileSpec in the user's home directory
all_paths = spec.find('~') # => [PathInst...]

default_pattern
default_var_name
path_part(path) → str
path_type
class xphyle.paths.PathInst

Bases: pathlib.PosixPath

A path-like that has a slot for variable values.

joinpath(*other: typing.Union[str, pathlib.PurePath]) → xphyle.paths.PathInst

Join two path-like objects, including merging ‘values’ dicts.

values
class xphyle.paths.PathSpec(dir_spec: typing.Union[str, pathlib.PurePath, xphyle.paths.DirSpec], file_spec: typing.Union[str, xphyle.paths.FileSpec]) → None

Bases: object

Specifies a path in terms of a template with named components (“path variables”).

Parameters:
  • dir_spec – A PathLike if the directory is fixed, otherwise a DirSpec.
  • file_spec – A string if the filename is fixed, otherwise a FileSpec.
construct(**kwargs) → xphyle.paths.PathInst

Create a new PathInst from this PathSpec using values in kwargs.

Parameters:kwargs – Specify values for path variables.
Returns:A PathInst
find(root: typing.Union[str, pathlib.PurePath] = None, path_types: typing.Sequence[typing.Union[str, xphyle.types.PathType]] = 'f', recursive: bool = False) → typing.Sequence[xphyle.paths.PathInst]

Find all paths matching this PathSpec. The search starts in ‘root’ if it is not None, otherwise it starts in the deepest fixed directory of this PathSpec’s DirSpec.

Parameters:
  • root – Directory in which to begin the search.
  • path_types – Types to return – files (‘f’), directories (‘d’) or both (‘fd’).
  • recursive – Whether to search recursively.
Returns:

A sequence of PathInst.

parse(path: typing.Union[str, pathlib.PurePath]) → xphyle.paths.PathInst

Extract PathVar values from path and create a new PathInst.

Parameters:path – The path to parse

Returns: a PathInst

class xphyle.paths.PathVar(name: str, optional: bool = False, default: typing.Any = None, pattern: typing.Union[str, Pattern[~AnyStr]] = None, valid: typing.Iterable[typing.Any] = None, invalid: typing.Iterable[typing.Any] = None) → None

Bases: object

Describes part of a path, used in PathSpec.

Parameters:
  • name – Path variable name
  • optional – Whether this part of the path is optional
  • default – A default value for this path variable
  • pattern – A pattern that the value must match
  • valid – Iterable of valid values
  • invalid – Iterable of invalid values

If valid is specified, invalid and pattern are ignored. Otherwise, values are first checked against pattern (if one is specified), then checked against invalid (if specified).

as_pattern() → str

Format this variable as a regular expression capture group.
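One plausible reading of "capture group" here is a named group, which lets parsed values be recovered by variable name. The sketch below assumes that form; as_pattern_sketch is a hypothetical helper, and the actual string returned by as_pattern may differ.

```python
import re


def as_pattern_sketch(name: str, pattern: str = ".*") -> str:
    """Hypothetical stand-in: wrap a pattern in a named capture group."""
    return f"(?P<{name}>{pattern})"


# Build a filename regex from two variables, mirroring the '{id}.{ext}'
# template used in the FileSpec example.
file_pattern = "{}\\.{}".format(
    as_pattern_sketch("id", "[A-Z0-9_]+"),
    as_pattern_sketch("ext", "[^.]+"),
)
match = re.fullmatch(file_pattern, "ABC123.txt")
values = match.groupdict() if match else {}
```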

xphyle.paths.STDERR = '_'

Placeholder for sys.stderr

xphyle.paths.STDIN = '-'

Placeholder for sys.stdin or sys.stdout (depending on access mode)

xphyle.paths.STDOUT = '-'

Placeholder for sys.stdin or sys.stdout (depending on access mode)

class xphyle.paths.SpecBase(*path_vars: xphyle.paths.PathVar, template: str = None, pattern: typing.Union[str, Pattern[~AnyStr]] = None) → None

Bases: object

Base class for DirSpec and FileSpec.

Parameters:
  • path_vars – Named variables with which to associate parts of a path.
  • template – Format string for creating paths from variables.
  • pattern – Regular expression for identifying matching paths.
construct(**kwargs) → xphyle.paths.PathInst

Create a new PathInst from this spec using values in kwargs.

Parameters:kwargs – Specify values for path variables.
Returns:A PathInst.
default_pattern

The default filename pattern.

default_search_root() → typing.Union[str, pathlib.PurePath]

Get the default root directory for searching.

default_var_name

The default variable name used for string formatting.

find(root: typing.Union[str, pathlib.PurePath] = None, recursive: bool = False) → typing.Sequence[xphyle.paths.PathInst]

Find all paths in root matching this spec.

Parameters:
  • root – Directory in which to begin the search.
  • recursive – Whether to search recursively.
Returns:

A sequence of PathInst.

parse(path: typing.Union[str, pathlib.PurePath], fullpath: bool = False) → xphyle.paths.PathInst

Extract PathVar values from path and create a new PathInst.

Parameters:path – The path to parse.

Returns: a PathInst.

path_part(path) → str

Return the part of the absolute path corresponding to the spec type.

path_type

The PathType.

class xphyle.paths.TempDir(permissions: typing.Union[xphyle.types.PermissionSet, typing.Sequence[typing.Union[str, int, xphyle.types.Permission, xphyle.types.ModeAccess]]] = 'rwx', path_descriptors: typing.Iterable[xphyle.paths.TempPathDescriptor] = None, **kwargs) → None

Bases: xphyle.paths.TempPath

Context manager that creates a temporary directory and cleans it up upon exit.

Parameters:
  • permissions – Access mode to set on the temp directory. All subdirectories and files will inherit this mode unless explicitly set to be different.
  • path_descriptors – Iterable of TempPathDescriptors.
  • kwargs – Additional arguments passed to tempfile.mkdtemp.

By default all subdirectories and files inherit the mode of the temporary directory. If TempPathDescriptors are specified, the paths are created before permissions are set, enabling creation of a read-only temporary file system.

absolute_path
close() → None

Delete the temporary directory and all files/subdirectories within.

make_directory(desc: xphyle.paths.TempPathDescriptor = None, apply_permissions: bool = True, **kwargs) → typing.Union[str, pathlib.PurePath]

Convenience method; calls make_path with path_type=’d’.

make_empty_files(num_files: int, **kwargs) → typing.Sequence[typing.Union[str, pathlib.PurePath]]

Create randomly-named empty files.

Parameters:
  • num_files – The number of files to create.
  • kwargs – Arguments to pass to TempPathDescriptor.
Returns:

A sequence of paths.

make_fifo(desc: xphyle.paths.TempPathDescriptor = None, apply_permissions: bool = True, **kwargs) → typing.Union[str, pathlib.PurePath]

Convenience method; calls make_path with path_type=’|’.

make_file(desc: xphyle.paths.TempPathDescriptor = None, apply_permissions: bool = True, **kwargs) → typing.Union[str, pathlib.PurePath]

Convenience method; calls make_path with path_type=’f’.

make_path(desc: xphyle.paths.TempPathDescriptor = None, apply_permissions: bool = True, **kwargs) → typing.Union[str, pathlib.PurePath]

Create a file or directory within the TempDir.

Parameters:
  • desc – A TempPathDescriptor.
  • apply_permissions – Whether permissions should be applied to the new file/directory.
  • kwargs – Arguments to TempPathDescriptor. Ignored unless desc is None.
Returns:

The absolute path to the new file/directory.

make_paths(*path_descriptors: xphyle.paths.TempPathDescriptor) → typing.Sequence[typing.Union[str, pathlib.PurePath]]

Create multiple files/directories at once. The paths are created before permissions are set, enabling creation of a read-only temporary file system.

Parameters:path_descriptors – One or more TempPathDescriptor.
Returns:A list of the created paths.
relative_path
class xphyle.paths.TempPath(parent: typing.Union[xphyle.paths.TempPath, NoneType] = None, permissions: typing.Union[xphyle.types.PermissionSet, typing.Sequence[typing.Union[str, int, xphyle.types.Permission, xphyle.types.ModeAccess]]] = 'rwx', path_type: typing.Union[str, xphyle.types.PathType] = 'd') → None

Bases: object

Base class for temporary files/directories.

Parameters:
  • parent – The parent directory.
  • permissions – The access permissions.
  • path_type – ‘f’ = file, ‘d’ = directory.
absolute_path

The absolute path.

exists

Whether the directory exists.

permissions

The permissions of the path. Defaults to the parent’s mode.

relative_path

The relative path.

set_permissions(permissions: typing.Union[xphyle.types.PermissionSet, typing.Sequence[typing.Union[str, int, xphyle.types.Permission, xphyle.types.ModeAccess]]] = None, set_parent: bool = False, additive: bool = False) → xphyle.types.PermissionSet

Set the permissions for the path.

Parameters:
  • permissions – The new flags to set. If None, the existing flags are used.
  • set_parent – Whether to recursively set the permissions of all parents. This is done additively.
  • additive – Whether permissions should be additive (e.g. if permissions == ‘w’ and self.permissions == ‘r’, the new mode is ‘rw’).
Returns:

The PermissionSet representing the flags that were set.
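The additive option can be illustrated on plain 'rwx' strings. This sketch only models the combination rule described above; combine_permissions_sketch is a hypothetical helper and does not touch the file system or use xphyle's PermissionSet.

```python
def combine_permissions_sketch(current: str, new: str, additive: bool = False) -> str:
    """Hypothetical stand-in: combine 'rwx'-style flag strings."""
    flags = set(new)
    if additive:
        # Additive mode keeps the existing flags and adds the new ones.
        flags |= set(current)
    # Emit the flags in canonical r, w, x order.
    return "".join(flag for flag in "rwx" if flag in flags)


additive_result = combine_permissions_sketch("r", "w", additive=True)
replace_result = combine_permissions_sketch("r", "w")
```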

class xphyle.paths.TempPathDescriptor(name: str = None, parent: xphyle.paths.TempPath = None, permissions: typing.Union[xphyle.types.PermissionSet, typing.Sequence[typing.Union[str, int, xphyle.types.Permission, xphyle.types.ModeAccess]]] = None, suffix: str = '', prefix: str = '', contents: str = '', path_type: typing.Union[str, xphyle.types.PathType] = 'f') → None

Bases: xphyle.paths.TempPath

Describes a temporary file or directory within a TempDir.

Parameters:
  • name – The file/directory name.
  • parent – The parent directory, a TempPathDescriptor.
  • permissions – The permissions mode.
  • suffix, prefix – The suffix and prefix to use when calling mkstemp or mkdtemp.
  • path_type – ‘f’ (for file), ‘d’ (for directory), or ‘|’ (for FIFO).
absolute_path

The absolute path.

create(apply_permissions: bool = True) → None

Create the file/directory.

Parameters:apply_permissions – Whether to set permissions according to self.permissions.
relative_path

The relative path.

xphyle.paths.abspath(path: typing.Union[str, pathlib.PurePath]) → typing.Union[str, pathlib.PurePath]

Returns the fully resolved path associated with path.

Parameters:path – Relative or absolute path
Returns:Fully resolved path

Examples

abspath('foo') # -> /path/to/curdir/foo
abspath('~/foo') # -> /home/curuser/foo
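A comparable result can be obtained from os.path alone. This is a sketch under the assumption that "fully resolved" means user expansion plus normalization to an absolute path; abspath_sketch is a hypothetical name, and xphyle may additionally resolve symbolic links.

```python
import os


def abspath_sketch(path: str) -> str:
    """Hypothetical stand-in: expand '~' and make the path absolute."""
    return os.path.abspath(os.path.expanduser(path))


relative = abspath_sketch("foo")
in_home = abspath_sketch("~/foo")
```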

xphyle.paths.check_access(path: typing.Union[str, pathlib.PurePath], permissions: typing.Union[int, xphyle.types.Permission, xphyle.types.ModeAccess, xphyle.types.PermissionSet, typing.Sequence[typing.Union[str, int, xphyle.types.Permission, xphyle.types.ModeAccess]]]) → xphyle.types.PermissionSet

Check that path is accessible with the given set of permissions.

Parameters:
  • path – The path to check.
  • permissions – Access specifier (string/int/ModeAccess).
Raises:

IOError if the path cannot be accessed according to permissions.

xphyle.paths.check_path(path: typing.Union[str, pathlib.PurePath], path_type: typing.Union[str, xphyle.types.PathType] = None, permissions: typing.Union[int, xphyle.types.Permission, xphyle.types.ModeAccess, xphyle.types.PermissionSet, typing.Sequence[typing.Union[str, int, xphyle.types.Permission, xphyle.types.ModeAccess]]] = None) → typing.Union[str, pathlib.PurePath]

Resolves the path (using resolve_path) and checks that the path is of the specified type and allows the specified access.

Parameters:
  • path – The path to check.
  • path_type – A string or PathType (‘f’ or ‘d’).
  • permissions – Access flag (string, int, Permission, or PermissionSet).
Returns:

The fully resolved path.

Raises:

IOError if the path does not exist, is not of the specified type, or doesn’t allow the specified access.

xphyle.paths.check_readable_file(path: typing.Union[str, pathlib.PurePath]) → typing.Union[str, pathlib.PurePath]

Check that path exists and is readable.

Parameters:path – The path to check
Returns:The fully resolved path of path
xphyle.paths.check_writable_file(path: typing.Union[str, pathlib.PurePath], mkdirs: bool = True) → typing.Union[str, pathlib.PurePath]

If path exists, check that it is writable, otherwise check that its parent directory exists and is writable.

Parameters:
  • path – The path to check.
  • mkdirs – Whether to create any missing directories (True).
Returns:

The fully resolved path.

xphyle.paths.filename(path: typing.Union[str, pathlib.PurePath]) → str

Equivalent to split_path(path)[1].

Parameters:path – The path.
Returns:The filename part of path
xphyle.paths.find(root: typing.Union[str, pathlib.PurePath], pattern: typing.Union[str, Pattern[~AnyStr]], path_types: typing.Sequence[typing.Union[str, xphyle.types.PathType]] = 'f', recursive: bool = True, return_matches: bool = False) → typing.Union[typing.Sequence[typing.Union[str, pathlib.PurePath]], typing.Sequence[typing.Tuple[typing.Union[str, pathlib.PurePath], Match[~AnyStr]]]]

Find all paths under root that match pattern.

Parameters:
  • root – Directory at which to start search.
  • pattern – File name pattern to match (string or re object).
  • path_types – Types to return – files (‘f’), directories (‘d’), or both (‘fd’).
  • recursive – Whether to search directories recursively.
  • return_matches – Whether to return regular expression match for each file.
Returns:

List of matching paths. If return_matches is True, each item will be a (path, Match) tuple.
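A regular-expression file search of this kind can be sketched with os.walk. The helper find_files_sketch is hypothetical, returns files only, and matches the pattern against the whole file name; whether xphyle uses a full match or a prefix match is not stated above.

```python
import os
import re
import tempfile
from typing import List


def find_files_sketch(root: str, pattern: str, recursive: bool = True) -> List[str]:
    """Hypothetical stand-in: list files under root whose names match pattern."""
    regex = re.compile(pattern)
    matches: List[str] = []
    for dirpath, dirnames, filenames in os.walk(root):
        matches.extend(
            os.path.join(dirpath, name)
            for name in filenames
            if regex.fullmatch(name)  # assumption: match the entire file name
        )
        if not recursive:
            dirnames.clear()  # stop os.walk from descending further
    return sorted(matches)


with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "sub"))
    for rel in ("a.txt", "c.log", os.path.join("sub", "b.txt")):
        open(os.path.join(tmp, rel), "w").close()
    deep = [os.path.relpath(p, tmp) for p in find_files_sketch(tmp, r".*\.txt")]
    shallow = [
        os.path.relpath(p, tmp)
        for p in find_files_sketch(tmp, r".*\.txt", recursive=False)
    ]
```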

xphyle.paths.get_permissions(path: typing.Union[str, pathlib.PurePath]) → xphyle.types.PermissionSet

Get the permissions of a file/directory.

Parameters:path – Path of file/directory.
Returns:A PermissionSet.
Raises:IOError if the file/directory doesn’t exist.
xphyle.paths.get_root(path: typing.Union[str, pathlib.PurePath] = None) → typing.Union[str, pathlib.PurePath]

Get the root directory.

Parameters:path – A path, or ‘.’ to get the root of the working directory, or None to get the root of the path to the script.
Returns:A path to the root directory.
xphyle.paths.match_to_dict(match: Match[~AnyStr], path_vars: typing.Dict[str, xphyle.paths.PathVar], errors: bool = True) → typing.Dict[str, typing.Any]

Convert a regular expression Match to a dict of (name, value) for all PathVars.

Parameters:
  • match – A re.Match.
  • path_vars – A dict of PathVars.
  • errors – If True, raise an exception on validation error, otherwise return None.
Returns:

A (name, value) dict.

Raises:

ValueError if any values fail validation.

xphyle.paths.path_inst(path: typing.Union[str, pathlib.PurePath], values: dict = None) → xphyle.paths.PathInst

Create a PathInst from a path and values dict.

Parameters:
  • path – The path.
  • values – The values dict.
Returns:

A PathInst.

xphyle.paths.resolve_path(path: typing.Union[str, pathlib.PurePath], parent: typing.Union[str, pathlib.PurePath] = None) → typing.Union[str, pathlib.PurePath]

Resolves the absolute path of the specified file and ensures that the file/directory exists.

Parameters:
  • path – Path to resolve.
  • parent – The directory containing path if path is relative.
Returns:

The absolute path.

Raises:

IOError – if the path does not exist or is invalid.

xphyle.paths.safe_check_path(path: typing.Union[str, pathlib.PurePath], *args, **kwargs) → typing.Union[str, pathlib.PurePath]

Safe version of check_path. Returns None rather than raising an exception.

xphyle.paths.safe_check_readable_file(path: typing.Union[str, pathlib.PurePath]) → typing.Union[str, pathlib.PurePath]

Safe version of check_readable_file. Returns None rather than raising an exception.

xphyle.paths.safe_check_writable_file(path: typing.Union[str, pathlib.PurePath]) → typing.Union[str, pathlib.PurePath]

Safe version of check_writable_file. Returns None rather than raising an exception.

xphyle.paths.set_permissions(path: typing.Union[str, pathlib.PurePath], permissions: typing.Union[xphyle.types.PermissionSet, typing.Sequence[typing.Union[str, int, xphyle.types.Permission, xphyle.types.ModeAccess]]]) → xphyle.types.PermissionSet

Sets file stat flags (using chmod).

Parameters:
  • path – The file to chmod.
  • permissions – Stat flags (any of ‘r’, ‘w’, ‘x’, or a PermissionSet).
Returns:

A PermissionSet.

xphyle.paths.split_path(path: typing.Union[str, pathlib.PurePath], keep_seps: bool = True, resolve: bool = True) → typing.Tuple[str, ...]

Splits a path into a (parent_dir, name, *ext) tuple.

Parameters:
  • path – The path
  • keep_seps – Whether the extension separators should be kept as part of the file extensions
  • resolve – Whether to resolve the path before splitting
Returns:

A tuple of length >= 2, in which the first element is the parent directory, the second element is the file name, and the remaining elements are file extensions.

Examples

split_path('myfile.foo.txt', False) # -> ('/current/dir', 'myfile', 'foo', 'txt')
split_path('/usr/local/foobar.gz', True) # -> ('/usr/local', 'foobar', '.gz')
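The tuple layout can be reproduced with a few lines of os.path code. This sketch omits the resolve step and treats every dot in the file name as an extension separator, so a hidden file such as '.bashrc' would yield an empty name; split_path_sketch is a hypothetical helper, not the xphyle function.

```python
import os
from typing import Tuple


def split_path_sketch(path: str, keep_seps: bool = True) -> Tuple[str, ...]:
    """Hypothetical stand-in: split a path into (parent_dir, name, *exts)."""
    parent, base = os.path.split(path)
    name, *exts = base.split(".")
    if keep_seps:
        # Keep the leading dot on each extension.
        exts = ["." + ext for ext in exts]
    return (parent, name, *exts)


with_seps = split_path_sketch("/usr/local/foobar.gz", True)
without_seps = split_path_sketch("/data/myfile.foo.txt", False)
```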

Plugin API

You shouldn’t need these modules unless you want to extend xphyle functionality.

xphyle.formats module

Interfaces to compression file formats. Magic numbers from: https://en.wikipedia.org/wiki/List_of_file_signatures

class xphyle.formats.BGzip

Bases: xphyle.formats.GzipBase

bgzip is block gzip. bgzip files are compatible with gzip. Typically, this format is only used when specifically requested, or when a bgzip file specifically has a .bgz (rather than .gz) extension.

exts
get_command(operation, src='-', stdout=True, compresslevel=None) → typing.List[str]
magic_bytes
mime_types
name
system_commands
class xphyle.formats.BZip2

Bases: xphyle.formats.SingleExeCompressionFormat

Implementation of CompressionFormat for bzip2 files.

compresslevel_range
default_compresslevel
exts
get_command(operation, src='-', stdout=True, compresslevel=6) → typing.List[str]
magic_bytes
mime_types
name
open_file_python(path_or_file: typing.Union[str, pathlib.PurePath, typing.IO], mode: typing.Union[str, xphyle.types.FileMode], **kwargs) → typing.IO
system_commands
class xphyle.formats.CompressionFormat

Bases: xphyle.formats.FileFormat

Base class for classes that provide access to system-level and python-level implementations of compression formats.

aliases

All of the aliases by which this format is known.

can_use_system_compression

Whether at least one command in self.system_commands resolves to an existing, executable file.

can_use_system_decompression

Whether at least one command in self.system_commands resolves to an existing, executable file.

compress(raw_bytes: bytes, **kwargs) → bytes

Compress bytes.

Parameters:
  • raw_bytes – The bytes to compress
  • kwargs – Additional arguments to compression function.
Returns:

The compressed bytes

compress_file(source: typing.Union[str, pathlib.PurePath, typing.IO], dest: typing.Union[str, pathlib.PurePath, typing.IO] = None, keep: bool = True, compresslevel: int = None, use_system: bool = True, **kwargs) → str

Compress data from one file and write to another.

Parameters:
  • source – Source file, either a path or an open file-like object.
  • dest – Destination file, either a path or an open file-like object. If None, the file name is determined from source.
  • keep – Whether to keep the source file
  • compresslevel – Compression level
  • use_system – Whether to try to use system-level compression
  • kwargs – Additional arguments to pass to the open method when opening the destination file
Returns:

Path to the destination file.

compress_iterable(strings: typing.Iterable[str], delimiter: bytes = b'', encoding: str = 'utf-8', **kwargs) → bytes

Compress an iterable of strings using the python-level interface.

Parameters:
  • strings – An iterable of strings
  • delimiter – The delimiter (byte string) to use to separate strings
  • encoding – The byte encoding (utf-8)
  • kwargs – Additional arguments to compression function
Returns:

The compressed text, as bytes

compress_name

The name of the compression program.

compress_path

The path of the compression program.

compress_string(text: str, encoding: str = 'utf-8', **kwargs) → bytes

Compress a string.

Parameters:
  • text – The text to compress
  • encoding – The byte encoding (utf-8)
  • kwargs – Additional arguments to compression function
Returns:

The compressed text, as bytes

compresslevel_range

The range of valid compression levels – (lowest, highest).

decompress(compressed_bytes, **kwargs) → bytes

Decompress bytes.

Parameters:
  • compressed_bytes – The compressed data
  • kwargs – Additional arguments to the decompression function
Returns:

The decompressed bytes

decompress_file(source: typing.Union[str, pathlib.PurePath, typing.IO], dest: typing.Union[str, pathlib.PurePath, typing.IO] = None, keep: bool = True, use_system: bool = True, **kwargs) → str

Decompress data from one file and write to another.

Parameters:
  • source – Source file, either a path or an open file-like object.
  • dest – Destination file, either a path or an open file-like object. If None, the file name is determined from source.
  • keep – Whether to keep the source file
  • use_system – Whether to try to use system-level compression
  • kwargs – Additional arguments to pass to the open method when opening the compressed file
Returns:

Path to the destination file

decompress_name

The name of the decompression program.

decompress_path

The path of the decompression program.

decompress_string(compressed_bytes: bytes, encoding: str = 'utf-8', **kwargs) → str

Decompress bytes and return as a string.

Parameters:
  • compressed_bytes – The compressed data
  • encoding – The byte encoding to use
  • kwargs – Additional arguments to the decompression function
Returns:

The decompressed data as a string
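The string-level round trip can be sketched with the standard gzip module. The two helper names are hypothetical and gzip is chosen only for illustration; a CompressionFormat subclass would dispatch to its own Python library.

```python
import gzip


def compress_string_sketch(text: str, encoding: str = "utf-8") -> bytes:
    """Hypothetical stand-in: encode text, then gzip-compress the bytes."""
    return gzip.compress(text.encode(encoding))


def decompress_string_sketch(compressed_bytes: bytes, encoding: str = "utf-8") -> str:
    """Hypothetical stand-in: gzip-decompress, then decode to a string."""
    return gzip.decompress(compressed_bytes).decode(encoding)


payload = compress_string_sketch("héllo wörld")
restored = decompress_string_sketch(payload)
```

The same encoding must be used in both directions, which is why both functions take it as a parameter.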

default_compresslevel

The default compression level, if compression is supported and is user-configurable, otherwise None.

default_ext

The default file extension for this format.

exts

The valid file extensions.

get_command(operation: str, src: str = '-', stdout: bool = True, compresslevel: int = None) → typing.List[str]

Build the command for the system executable.

Parameters:
  • operation – ‘c’ = compress, ‘d’ = decompress
  • src – The source file path, or STDIN if input should be read from stdin
  • stdout – Whether output should go to stdout
  • compresslevel – Integer compression level; typically 1-9
Returns:

List of command arguments

magic_bytes

The initial bytes that indicate the file type.

mime_types

The MIME types.

name

The canonical format name.

open_file(path: str, mode: typing.Union[str, xphyle.types.FileMode], use_system: bool = True, **kwargs) → typing.IO

Opens a compressed file for reading or writing.

If use_system is True and the system provides an accessible executable, then system-level compression is used. Otherwise defaults to using the python implementation.

Parameters:
  • path – The path of the file to open.
  • mode – The file open mode.
  • use_system – Whether to attempt to use system-level compression.
  • kwargs – Additional arguments to pass to the python-level open method, if system-level compression isn’t used.
Returns:

A file-like object.

open_file_python(path_or_file: typing.Union[str, pathlib.PurePath, typing.IO], mode: typing.Union[str, xphyle.types.FileMode], **kwargs) → typing.IO

Open a file using the python library.

Parameters:
  • path_or_file – The file to open – a path or open file object.
  • mode – The file open mode.
  • kwargs – Additional arguments to pass to the open method.
Returns:

A file-like object.

system_commands

The names of the system-level commands, in order of preference.

class xphyle.formats.FileFormat

Bases: object

Base class for classes that wrap built-in python file format libraries. The subclass must provide the name member.

lib

Caches and returns the python module associated with this file format.

Returns:The module
Raises:CompressionError if the module cannot be imported.
class xphyle.formats.Formats

Bases: object

Manages a set of compression formats.

compression_format_aliases = None

Dict mapping aliases to compression format names.

compression_formats = None

Dict of registered compression formats

get_compression_format(name: str) → xphyle.formats.CompressionFormat

Returns the CompressionFormat associated with the given name.

Raises:ValueError if that format is not supported.
get_compression_format_name(alias: str)

Returns the canonical name for the given alias.

get_format_for_mime_type(mime_type: str) → str

Returns the file format associated with a MIME type, or None if no format is associated with the mime type.

guess_compression_format(name: str) → str

Guess the compression format by name or file extension.

Returns:The format name, or None if it could not be guessed.
guess_format_from_buffer(buffer: _io.BufferedReader) → str

Guess file format from a byte buffer that provides a peek method.

Parameters:buffer – The buffer object
Returns:The format name, or None if it could not be guessed.
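
A peek method matters here because it lets the header be inspected without advancing the stream. The sketch below shows the idea for the gzip signature only; looks_like_gzip is a hypothetical name, not an xphyle function.

```python
import gzip
import io

GZIP_MAGIC = b"\x1f\x8b"


def looks_like_gzip(buffer):
    """Check for the gzip signature without consuming bytes from the buffer."""
    # peek() may return more or fewer bytes than requested, so slice the result.
    head = buffer.peek(len(GZIP_MAGIC))[: len(GZIP_MAGIC)]
    return head == GZIP_MAGIC


raw = io.BytesIO(gzip.compress(b"payload"))
reader = io.BufferedReader(raw)
is_gzip = looks_like_gzip(reader)
# The stream position is unchanged, so the data can still be decompressed.
payload = gzip.decompress(reader.read())
```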
guess_format_from_file_header(path: str) → str

Guess file format from ‘magic bytes’ at the beginning of the file.

Note that path must be openable and readable. If it is a named pipe or other pseudo-file type, the magic bytes will be destructively consumed and thus the file will not open correctly afterwards.

Parameters:path – Path to the file
Returns:The format name, or None if it could not be guessed.
guess_format_from_header_bytes(header_bytes: bytes) → str

Guess file format from a sequence of bytes from a file header.

Parameters:header_bytes – The bytes
Returns:The format name, or None if it could not be guessed.
list_compression_formats() → typing.Tuple

Returns a tuple of all registered compression formats.

magic_bytes = None

Dict mapping the first byte in a ‘magic’ sequence to a tuple of (format, rest_of_sequence)

max_magic_bytes = None

Maximum number of bytes in a registered magic byte sequence
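
One way to picture the magic_bytes and max_magic_bytes attributes is as a lookup table keyed on the first byte of each signature. The sketch below is an assumption about the layout, not the library's code: the signature table is illustrative, each key maps to a tuple of (format, rest_of_sequence) pairs so that several formats can share a first byte, and guess_from_header is a hypothetical name.

```python
# Illustrative signatures only; a real registry would be populated by the
# registered formats.
SIGNATURES = {
    "gzip": b"\x1f\x8b",
    "bz2": b"BZh",
    "lzma": b"\xfd7zXZ\x00",
}

# First byte -> tuple of (format, rest_of_sequence) pairs.
magic_bytes = {}
for fmt, sig in SIGNATURES.items():
    magic_bytes.setdefault(sig[0], ())
    magic_bytes[sig[0]] += ((fmt, sig[1:]),)

# Longest signature; a reader never needs more header bytes than this.
max_magic_bytes = max(len(sig) for sig in SIGNATURES.values())


def guess_from_header(header_bytes):
    """Return the format whose signature prefixes header_bytes, else None."""
    if not header_bytes:
        return None
    for fmt, rest in magic_bytes.get(header_bytes[0], ()):
        if header_bytes[1 : 1 + len(rest)] == rest:
            return fmt
    return None
```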

mime_types = None

Dict mapping MIME types to file formats

register_compression_format(format_class: typing.Callable[[], xphyle.formats.CompressionFormat]) → None

Register a new compression format.

Parameters:format_class – a subclass of CompressionFormat
class xphyle.formats.Gzip

Bases: xphyle.formats.GzipBase

Implementation of CompressionFormat for gzip files.

compresslevel_range

The compression level; pigz allows 0-11 (har har) while gzip allows 0-9.

default_compresslevel
exts
get_command(operation, src='-', stdout=True, compresslevel=None) → typing.List[str]
magic_bytes
mime_types
name
system_commands
class xphyle.formats.GzipBase

Bases: xphyle.formats.SingleExeCompressionFormat

Base class for gzip and bgzip files.

open_file_python(path_or_file: typing.Union[str, pathlib.PurePath, typing.IO], mode: typing.Union[str, xphyle.types.FileMode], **kwargs) → typing.IO
class xphyle.formats.Lzma

Bases: xphyle.formats.SingleExeCompressionFormat

Implementation of CompressionFormat for lzma (.xz) files.

compress(raw_bytes, **kwargs) → bytes
compresslevel_range
default_compresslevel
exts
get_command(operation, src='-', stdout=True, compresslevel=6) → typing.List[str]
magic_bytes
mime_types
name
system_commands
class xphyle.formats.SingleExeCompressionFormat

Bases: xphyle.formats.CompressionFormat

Base class for CompressionFormats that use the same executable for compressing and decompressing.

compress_name
compress_path
decompress_name
decompress_path
executable_name

The name of the system executable.

executable_path

The path of the system executable.

class xphyle.formats.SystemIO(path: typing.Union[str, pathlib.PurePath]) → None

Bases: xphyle.types.FileLikeBase

Base class for SystemReader and SystemWriter.

Parameters:path – The file path.
closed
name
class xphyle.formats.SystemReader(executable_path: typing.Union[str, pathlib.PurePath], path: typing.Union[str, pathlib.PurePath], command: typing.List[str], executable_name: str = None) → None

Bases: xphyle.formats.SystemIO

Read from a compressed file using a system-level compression program.

Parameters:
  • executable_path – The fully resolved path to the system executable.
  • path – The compressed file to read
  • command – List of command arguments.
  • executable_name – The display name of the executable, or None to use the basename of executable_path
close() → None

Close the reader; terminates the underlying process.

flush() → None

Implementing file interface; no-op.

mode
read(*args) → bytes

Read bytes from the stream. Arguments are passed through to the subprocess read method.

readable() → bool

Implementing file interface; returns True.

class xphyle.formats.SystemWriter(executable_path: typing.Union[str, pathlib.PurePath], path: typing.Union[str, pathlib.PurePath], mode: typing.Union[str, xphyle.types.FileMode] = 'w', command: typing.List[str] = None, executable_name: str = None) → None

Bases: xphyle.formats.SystemIO

Write to a compressed file using a system-level compression program.

Parameters:
  • executable_path – The fully resolved path to the system executable.
  • path – The compressed file to write.
  • mode – The write mode (w/a/x).
  • command – Format string with two variables – exe (the path to the system executable), and path.
  • executable_name – The display name of the executable, or None to use the basename of executable_path.
close() → None

Close the writer; terminates the underlying process.

flush() → None

Flush stdin of the underlying process.

mode
writable() → bool

Implementing file interface; returns True.

write(arg) → int

Write to stdin of the underlying process.

xphyle.formats.THREADS = <xphyle.formats.ThreadsVar object>

Number of concurrent threads that can be used by formats that support parallelization.

class xphyle.formats.ThreadsVar(default_value: int = 1) → None

Bases: object

Maintain threads variable.

update(threads: int = True) → None

Update the number of threads to use.

Parameters:threads – True = use all available cores; False or an int <= 1 means single-threaded; None means reset to the default value; otherwise an integer number of threads.
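
The accepted values for threads can be read as a small normalization rule. The following sketch restates that rule as a standalone function; resolve_threads is a hypothetical name, and using os.cpu_count() to count available cores is an assumption.

```python
import os


def resolve_threads(threads, default=1):
    """Map the accepted argument forms onto a concrete thread count."""
    if threads is None:
        # None resets to the default value.
        return default
    # bool is a subclass of int, so test for it before the integer case.
    if threads is True:
        return os.cpu_count() or 1
    if threads is False or threads <= 1:
        # False or any int <= 1 means single-threaded.
        return 1
    return int(threads)
```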

xphyle.progress module

Common interface to enable operations to be wrapped in a progress bar. By default, tqdm is used for python-level operations and pv for system-level operations.

class xphyle.progress.IterableProgress(default_wrapper: typing.Callable = <class 'xphyle.progress.Tqdm'>) → None

Bases: object

Manages the python-level wrapper.

Parameters:default_wrapper – Callable (typically a class) that returns a Callable with the signature of wrap.
update(enable: bool = None, wrapper: typing.Callable[..., typing.Iterable] = None) → None

Enable the python progress bar and/or set a new wrapper.

Parameters:
  • enable – Whether to enable use of a progress wrapper.
  • wrapper – A callable that takes three arguments, itr, desc, size, and returns an iterable.
wrap(itr: typing.Iterable, desc: str = None, size: int = None) → typing.Iterable

Wrap an iterable in a progress bar.

Parameters:
  • itr – The Iterable to wrap.
  • desc – Optional description.
  • size – Optional max value of the progress bar.
Returns:

The wrapped Iterable.
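
A replacement wrapper only needs to accept itr, desc and size and return an iterable. As a rough illustration, the generator below reports a running count instead of drawing a bar; simple_progress is a hypothetical name, and the every and stream parameters are extras added for this sketch.

```python
import sys


def simple_progress(itr, desc=None, size=None, every=1000, stream=sys.stderr):
    """Pass items through unchanged, reporting a running count.

    Takes the three documented arguments (itr, desc, size); `every` and
    `stream` are not part of the documented signature.
    """
    label = desc or "progress"
    total = "?" if size is None else str(size)
    count = 0
    for count, item in enumerate(itr, start=1):
        if count % every == 0:
            print(f"{label}: {count}/{total}", file=stream)
        yield item
    print(f"{label}: done ({count} items)", file=stream)
```

Because the items are yielded unchanged, the wrapped iterable can be consumed exactly like the original.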

class xphyle.progress.ProcessProgress(default_wrapper: typing.Callable = <function pv_command>) → None

Bases: object

Manage the system-level progress wrapper.

Parameters:default_wrapper – Callable that returns the argument list for the default wrapper command.
update(enable: bool = None, wrapper: typing.Sequence[str] = None) → None

Enable the system progress bar and/or set the wrapper command.

Parameters:
  • enable – Whether to enable use of a progress wrapper.
  • wrapper – A command string or sequence of command arguments.
wrap(cmd: typing.Sequence[str], stdin: typing.IO, stdout: typing.IO, **kwargs) → subprocess.Popen

Pipe a system command through a progress bar program.

For the process to be wrapped, one of stdin, stdout must not be None.

Parameters:
  • cmd – Command arguments.
  • stdin – File-like object to read into the process stdin, or None to use PIPE.
  • stdout – File-like object to write from the process stdout, or None to use PIPE.
  • kwargs – Additional arguments to pass to Popen.
Returns:

Open process.

class xphyle.progress.Tqdm

Bases: object

Default python progress bar wrapper.

xphyle.progress.iter_file_chunked(fileobj: typing.IO, chunksize: int = 1024) → typing.Iterable

Returns a progress bar-wrapped iterator over a file that reads fixed-size chunks.

Parameters:
  • fileobj – A file-like object.
  • chunksize – The maximum size in bytes of each chunk.
Returns:

An iterable over the chunks of the file.
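
The chunked read itself is a simple loop. The sketch below leaves out the progress bar and shows only the fixed-size iteration; iter_chunks is a hypothetical stand-in, not the library function.

```python
import io


def iter_chunks(fileobj, chunksize=1024):
    """Yield successive chunks of at most chunksize bytes until EOF."""
    while True:
        chunk = fileobj.read(chunksize)
        if not chunk:
            # An empty read signals end of file.
            break
        yield chunk


chunks = list(iter_chunks(io.BytesIO(b"a" * 2500), chunksize=1024))
sizes = [len(c) for c in chunks]
```

Only the final chunk may be shorter than chunksize.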

xphyle.progress.pv_command(require: bool = False) → typing.Tuple

Default system wrapper command.

xphyle.progress.system_progress_command(exe: typing.Union[str, pathlib.PurePath], *args, require: bool = False) → typing.Tuple

Resolve a system-level progress bar command.

Parameters:
  • exe – The executable name or absolute path.
  • args – A list of additional command line arguments.
  • require – Whether to raise an exception if the command does not exist.
Returns:

A tuple of (executable_path, *args).
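
Resolving such a command amounts to a PATH lookup followed by building the argument tuple. A minimal sketch, assuming shutil.which for the lookup and None as the result when an optional executable is missing (neither is confirmed by the text above); resolve_command is a hypothetical name.

```python
import shutil


def resolve_command(exe, *args, require=False):
    """Return (executable_path, *args), or None if exe is not found.

    Returning None for a missing optional executable is an assumption of
    this sketch.
    """
    path = shutil.which(str(exe))
    if path is None:
        if require:
            raise FileNotFoundError(f"executable not found: {exe}")
        return None
    return (path,) + args
```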

xphyle.urls module

Methods for handling URLs.

xphyle.urls.get_url_file_name(response: typing.Any, parsed_url: typing.Tuple[str, str, str, str, str, str] = None) → str

If a response object has HTTP-like headers, extract the filename from the Content-Disposition header.

Parameters:
  • response – A response object returned by open_url.
  • parsed_url – The result of calling parse_url.
Returns:

The file name, or None if it could not be determined.

xphyle.urls.get_url_mime_type(response: typing.Any) → str

If a response object has HTTP-like headers, extract the MIME type from the Content-Type header.

Parameters:response – A response object returned by open_url.
Returns:The content type, or None if the response lacks a ‘Content-Type’ header.
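
Both header lookups can be illustrated on a plain mapping of headers. The helper names below are hypothetical, and the sketch ignores the parsed_url argument because the text does not say how it is used.

```python
from email.message import Message


def file_name_from_headers(headers):
    """Return the filename parameter of Content-Disposition, or None."""
    value = headers.get("Content-Disposition")
    if value is None:
        return None
    # Message handles quoting and parameter parsing.
    msg = Message()
    msg["Content-Disposition"] = value
    return msg.get_filename()


def mime_type_from_headers(headers):
    """Return the bare MIME type from Content-Type, or None if absent."""
    value = headers.get("Content-Type")
    if value is None:
        return None
    # Drop parameters such as "; charset=utf-8".
    return value.split(";", 1)[0].strip().lower()
```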
xphyle.urls.open_url(url_string: str, byte_range: typing.Tuple[int, int] = None, headers: dict = None, **kwargs) → typing.Any

Open a URL for reading.

Parameters:
  • url_string – A valid URL string.
  • byte_range – Range of bytes to read (start, stop).
  • headers – dict of request headers.
  • kwargs – Additional arguments to pass to urlopen.
Returns:

A response object, or None if the URL is not valid or cannot be opened.

Notes

The return value of urlopen is only guaranteed to have certain methods, not to be of any specific type, thus the Any return type. Furthermore, the response may be wrapped in an io.BufferedReader to ensure that a peek method is available.
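
A byte range is typically sent as an HTTP Range header. The sketch below builds such a request with urllib without opening it; build_range_request is a hypothetical helper, and it passes start and stop through unchanged because the text does not say whether stop is treated as inclusive.

```python
import urllib.request


def build_range_request(url_string, byte_range=None, headers=None):
    """Build a Request, adding a Range header when byte_range is given."""
    request = urllib.request.Request(url_string, headers=dict(headers or {}))
    if byte_range is not None:
        start, stop = byte_range
        # HTTP byte ranges are inclusive at both ends.
        request.add_header("Range", f"bytes={start}-{stop}")
    return request
```

The resulting object could then be handed to urllib.request.urlopen, which performs the actual network access.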

xphyle.urls.parse_url(url_string: str) → typing.Tuple[str, str, str, str, str, str]

Attempts to parse a URL.

Parameters:url_string – String to test.
Returns:A 6-tuple, as described in urlparse, or None if the URL cannot be parsed, or if it lacks a minimum set of attributes. Note that a URL may be valid and still not be openable (for example, if the scheme is not recognized by urlopen).
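
The parse-or-None behaviour can be approximated with urllib.parse. The "minimum set of attributes" is not spelled out, so the sketch assumes a scheme plus either a network location or a path; parse_url_or_none is a hypothetical name.

```python
from urllib.parse import urlparse


def parse_url_or_none(url_string):
    """Return the urlparse 6-tuple, or None if it lacks a scheme or location."""
    try:
        parsed = urlparse(url_string)
    except ValueError:
        return None
    # Assumed minimum: a scheme plus either a network location or a path.
    if not parsed.scheme or not (parsed.netloc or parsed.path):
        return None
    return tuple(parsed)
```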