Welcome to dohlee’s documentation!
dohlee package
Submodules
dohlee.gene module
dohlee.gene.get_first_item(items)
Return the first item of items. If items is a single value, return it unchanged.
Parameters:
- items (list/value) – A list of items or a single value.
Returns: The first item of items, or items itself if it is a single value.
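The behavior described above can be sketched in plain Python. This is a minimal illustration, not the package's actual source: the name first_item_sketch is hypothetical, and treating only lists and tuples as "a list of items" is an assumption.

```python
def first_item_sketch(items):
    """Return the first element of a list/tuple, or the value itself otherwise."""
    # Assumption: only list and tuple count as "a list of items";
    # strings and other single values are returned unchanged.
    if isinstance(items, (list, tuple)):
        return items[0]
    return items


print(first_item_sketch(["TP53", "BRCA1"]))  # TP53
print(first_item_sketch("TP53"))             # TP53
```

Note that an empty list raises IndexError in this sketch; whether the real function guards against that case is not documented here.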
dohlee.hg38 module
dohlee.plot module
dohlee.plot.save(file, dpi=300, tight_layout=True)
Save plot to a file.
Parameters:
- file (str) – Path to the resulting image file.
- dpi (int) – (default=300) Resolution in dots per inch.
- tight_layout (bool) – (default=True) Whether to run plt.tight_layout() before saving the plot.
dohlee.plot.set_style(style='white', palette='deep', context='talk', font='Helvetica Neue', scale=1.0, font_scale=1.0)
Set plot preferences in a way that looks good to me.
dohlee.plot.get_axis(figsize=None, dpi=300)
Get plot axis with predefined/user-defined width and height.
>>> ax = get_axis()
>>> ax = get_axis(figsize=(7.2, 4.45))
Parameters:
- scale (float) – Figure size scale. Width and height will be scaled by this value.
- figsize (tuple) – Use user-defined width and height. If this is given, the scale parameter will be ignored.
- dpi (int) – (default=300) Figure resolution in dots per inch.
dohlee.plot.frequency(data, order=None, sort_by_values=False, dy=0.01, ax=None, **kwargs)
Plot frequency bar chart.
>>> frequency([1, 2, 2, 3, 3, 3], order=[3, 1, 2], sort_by_values=True)
Parameters:
- data (list) – A list of elements.
- order (list) – A list of elements giving the order in which the elements are plotted.
- sort_by_values (bool) – If True, the plot will be sorted in decreasing order of frequency values.
- dy (float) – Gap between a bar and its count label.
- ax (pyplot-axis) – Axis to draw the plot on.
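To make the roles of order and sort_by_values concrete, the counts that a frequency bar chart displays can be computed with the standard library alone. This is a sketch under stated assumptions, not the function's implementation: the helper name frequency_table is hypothetical, and it assumes that sort_by_values takes precedence when both options are given, which the documentation above does not specify.

```python
from collections import Counter


def frequency_table(data, order=None, sort_by_values=False):
    """Return (element, count) pairs in the order the bars would be drawn."""
    counts = Counter(data)
    if sort_by_values:
        # Assumption: sorting by frequency overrides an explicit order.
        # sorted() is stable, so ties keep first-appearance order.
        keys = sorted(counts, key=counts.get, reverse=True)
    elif order is not None:
        keys = list(order)
    else:
        # Default: order of first appearance in the data.
        keys = list(counts)
    return [(key, counts[key]) for key in keys]


data = [1, 2, 2, 3, 3, 3]
print(frequency_table(data, order=[3, 1, 2]))
print(frequency_table(data, sort_by_values=True))
```

An element listed in order but absent from data gets a count of 0 here, since Counter returns 0 for missing keys.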
dohlee.plot.histogram(data, ax=None, **kwargs)
Draw a histogram.
>>> histogram(data=data, ax=ax, lw=1.55)
Parameters:
- data (list) – A list containing values. Density of the values will be drawn as a histogram.
- ax (axis) – Matplotlib axis to draw the plot on.
dohlee.plot.boxplot(data, x, y, hue=None, ax=None, strip=False, box_kwargs={}, strip_kwargs={})
Draw a boxplot.
>>> boxplot(data, x='species', y='sepal_length', strip=True)
Parameters:
- data (dataframe) – Dataframe for boxplot.
- x (str) – Column name representing x variable of the plot.
- y (str) – Column name representing y variable of the plot.
- ax (axis) – (Optional) Matplotlib axis to draw the plot on.
- strip (bool) – (default=False) Draw an overlaid stripplot.
dohlee.plot.volcano(data, x, y, padj, label, cutoff=0.05, sample1=None, sample2=None, ax=None)
Draw a volcano plot.
>>> volcano(data=data, x='log2FoldChange', y='pvalue', label='Gene_Symbol', cutoff=0.05, padj='padj')
Parameters:
- data (dataframe) – A dataframe resulting from a DEG-discovery tool.
- x (str) – Column name denoting log2 fold change.
- y (str) – Column name denoting p-value. (Note that p-values will be log10-transformed, so they should not be transformed beforehand.)
- padj (str) – Column name denoting adjusted p-value.
- label (str) – Column name denoting gene identifier.
- cutoff (float) – (Optional) Adjusted p-value cutoff value to report significant DEGs.
- sample1 (str) – (Optional) First sample name.
- sample2 (str) – (Optional) Second sample name.
- ax (axis) – (Optional) Matplotlib axis to draw the plot on.
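The transformation behind a volcano plot can be illustrated without any plotting library. The sketch below is not the function's implementation: volcano_points is a hypothetical helper, the records are made-up toy values, and it assumes the conventional negative log10 transform of the p-value and a strict "less than" comparison against cutoff.

```python
import math

# Hypothetical DEG results; gene names and values are invented for illustration.
records = [
    {"Gene_Symbol": "GENE_A", "log2FoldChange": 2.1, "pvalue": 1e-6, "padj": 1e-4},
    {"Gene_Symbol": "GENE_B", "log2FoldChange": -0.3, "pvalue": 0.2, "padj": 0.5},
    {"Gene_Symbol": "GENE_C", "log2FoldChange": -1.8, "pvalue": 1e-3, "padj": 0.02},
]


def volcano_points(records, x, y, padj, label, cutoff=0.05):
    """Compute plot coordinates and a significance flag for each record."""
    points = []
    for record in records:
        points.append({
            "label": record[label],
            "x": record[x],
            # Assumption: the y axis shows -log10(p), the usual convention.
            "y": -math.log10(record[y]),
            # Assumption: significant means adjusted p-value strictly below cutoff.
            "significant": record[padj] < cutoff,
        })
    return points


for point in volcano_points(records, x="log2FoldChange", y="pvalue",
                            padj="padj", label="Gene_Symbol"):
    print(point["label"], point["x"], round(point["y"], 2), point["significant"])
```

Because the transform is applied inside the helper, the input column must hold raw p-values, which matches the note on the y parameter above.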
dohlee.plot.pca(data, labels=None, ax=None, **kwargs)
Draw a simple principal component analysis plot of the data.
Parameters:
- data (matrix) – Input data. Numpy array recommended.
- labels (list) – (Optional) Corresponding labels to each datum. If specified, data points in the plot will be colored according to the label.
- ax (axis) – (Optional) Matplotlib axis to draw the plot on.
- kwargs – Any other keyword arguments will be passed on to matplotlib.pyplot.scatter.
dohlee.plot.tsne(data, labels=None, ax=None, **kwargs)
Draw a t-SNE analysis plot of the data.
Parameters:
- data (matrix) – Input data. Numpy array recommended.
- labels (list) – (Optional) Corresponding labels to each datum. If specified, data points in the plot will be colored according to the label.
- ax (axis) – (Optional) Matplotlib axis to draw the plot on.
- kwargs – Any other keyword arguments will be passed on to matplotlib.pyplot.scatter.
dohlee.plot.coverages(path, chrom, start, end, strict=False, tick_every=1000, ax=None, **kwargs)
dohlee.plot.bisulfite(path, chrom, start, end, ax=None, tick_every=1000, strict=False, **kwargs)
dohlee.plot.stacked_bar_chart(data, x, y, ax=None, sort=False, reverse=True, sort_by=None, group=None, group_order=None, group_label=True)
TODO
dohlee.plot.umap(data, labels=None, ax=None, **kwargs)
Draw a UMAP embedding plot of the data.
Parameters:
- data (matrix) – Input data. Numpy array recommended.
- labels (list) – (Optional) Corresponding labels to each datum. If specified, data points in the plot will be colored according to the label.
- ax (axis) – (Optional) Matplotlib axis to draw the plot on.
- kwargs – Any other keyword arguments will be passed on to matplotlib.pyplot.scatter.
dohlee.thread module
dohlee.thread.imap_helper(args)
Helper function for imap. This is needed because the built-in multiprocessing library does not provide an istarmap function. If packed arguments are passed, it unpacks them and passes them to the function. Otherwise, it passes the single argument to the function as-is.
Attributes:
- args: Tuple of two elements: the user-defined function and the argument(s) to pass to it.
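The unpacking behavior described above might look like the following. This is a hedged sketch rather than the package's source: imap_helper_sketch is a hypothetical name, and it assumes that "packed arguments" means a tuple.

```python
def imap_helper_sketch(args):
    """Call func with either unpacked tuple arguments or a single argument."""
    func, params = args
    # Assumption: a tuple signals packed positional arguments.
    if isinstance(params, tuple):
        return func(*params)
    return func(params)


print(imap_helper_sketch((pow, (2, 3))))  # 8
print(imap_helper_sketch((abs, -4)))      # 4
```

Bundling the function and its arguments into one tuple is what lets a single-argument mapping call such as Pool.imap drive functions that take several arguments.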
dohlee.thread.threaded(func, params, processes, progress=False, progress_type='tqdm')
Generate results of the function with given parameters with threads.
Attributes:
- func (function): Function to be executed.
- params (iterable): A list of parameters.
- processes (int): Number of processes to work on.
- progress (bool): If True, show progress bar.
- progress_type (str): ‘tqdm’ or ‘tqdm_notebook’ can be used.
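A generator with this shape can be approximated using only the standard library. The sketch below is an assumption-laden illustration, not the actual implementation: threaded_sketch and _call are hypothetical names, it uses multiprocessing.pool.ThreadPool (the real function may use a process pool instead), and it omits the progress bar options.

```python
from multiprocessing.pool import ThreadPool


def _call(args):
    """Unpack (func, params) and call func, unpacking tuple params."""
    func, params = args
    if isinstance(params, tuple):
        return func(*params)
    return func(params)


def threaded_sketch(func, params, processes):
    """Yield func's result for each entry of params, in input order."""
    # Assumption: a thread pool stands in for whatever pool the package uses.
    with ThreadPool(processes) as pool:
        # imap yields results lazily and preserves the input order.
        yield from pool.imap(_call, [(func, p) for p in params])


print(list(threaded_sketch(pow, [(2, 3), (3, 2)], processes=2)))  # [8, 9]
print(list(threaded_sketch(abs, [-1, 2, -3], processes=2)))       # [1, 2, 3]
```

Because the function is a generator, no work is submitted until the caller starts iterating, and the pool is shut down once iteration finishes.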