# MuData nuances

This is the "sharp bits" page for mudata: it describes the nuances of working with MuData objects.

First, install and import mudata and other libraries.

[1]:

! pip install mudata

[2]:

import mudata as md
from mudata import MuData, AnnData

[3]:

import numpy as np
import pandas as pd


Prepare some simple AnnData objects:

[4]:

n, d1, d2, k = 1000, 100, 200, 10

np.random.seed(1)
z = np.random.normal(loc=np.arange(k), scale=np.arange(k)*2, size=(n,k))
w1 = np.random.normal(size=(d1,k))
w2 = np.random.normal(size=(d2,k))

mod1 = AnnData(X=np.dot(z, w1.T))
mod2 = AnnData(X=np.dot(z, w2.T))


## Variable names

*NB: It is best to keep variable names unique across all the modalities. This helps to avoid ambiguity and improves the performance of some functionality, such as updating (see below).*

MuData is designed with the assumption that different modalities contain different features (variables). Hence feature names should be unique across modalities; in other words, .var_names are checked for uniqueness across modalities.

This behaviour ensures all the functions are easy to reason about. For instance, if a var_name is present in both modalities, it is not strictly defined what happens when plotting a joint embedding from .obsm coloured by this var_name.
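The ambiguity can be made concrete with plain pandas. The following is a minimal sketch, not part of mudata itself: the feature names and the `rna_var_names` / `prot_var_names` variables are hypothetical and only illustrate how overlapping names can be detected before building a multimodal object.

```python
import pandas as pd

# Hypothetical feature names for two modalities (illustrative only,
# not taken from the objects created in this notebook).
rna_var_names = pd.Index(["CD3E", "CD4", "MS4A1"])
prot_var_names = pd.Index(["CD3E", "CD19"])

# Names present in both modalities are the ambiguous ones.
shared_names = rna_var_names.intersection(prot_var_names)

# The concatenated index is what a joint, multimodal view has to work with.
joint_names = rna_var_names.append(prot_var_names)

print("shared:", list(shared_names))
print("joint names unique:", joint_names.is_unique)
```

A non-empty `shared_names` means that a lookup by name in the joint view could refer to more than one feature.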

Nevertheless, MuData can accommodate modalities with duplicated .var_names. For typical workflows, we recommend renaming them manually or calling .var_names_make_unique().

[5]:

mdata = MuData({"mod1": mod1, "mod2": mod2})
print(mdata.var_names)
mdata.var_names_make_unique()
print(mdata.var_names)

Index(['0', '1', '2', '3', '4', '5', '6', '7', '8', '9',
...
'190', '191', '192', '193', '194', '195', '196', '197', '198', '199'],
dtype='object', length=300)
Index(['mod1:0', 'mod1:1', 'mod1:2', 'mod1:3', 'mod1:4', 'mod1:5', 'mod1:6',
'mod1:7', 'mod1:8', 'mod1:9',
...
'mod2:190', 'mod2:191', 'mod2:192', 'mod2:193', 'mod2:194', 'mod2:195',
'mod2:196', 'mod2:197', 'mod2:198', 'mod2:199'],
dtype='object', length=300)

/usr/local/opt/python@3.8/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/mudata/_core/mudata.py:404: UserWarning: Cannot join columns with the same name because var_names are intersecting.
warnings.warn(
/usr/local/opt/python@3.8/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/mudata/_core/mudata.py:852: UserWarning: Modality names will be prepended to var_names since there are identical var_names in different modalities.
warnings.warn(


### Variable names in AnnData objects

Note that in the example above .var_names_make_unique() is an in-place operation, just like the method of the same name in anndata.

Hence the .var_names of the original AnnData objects have also been modified:

[6]:

mdata["mod1"].var_names[:10]

[6]:

Index(['mod1:0', 'mod1:1', 'mod1:2', 'mod1:3', 'mod1:4', 'mod1:5', 'mod1:6',
'mod1:7', 'mod1:8', 'mod1:9'],
dtype='object')
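The effect is the usual consequence of a container holding references rather than copies. As a rough analogy only (this is not how mudata is implemented), a plain dict holding a pandas DataFrame behaves the same way; `modality` and `container` are hypothetical stand-ins for an AnnData object and its MuData parent.

```python
import pandas as pd

# Stand-in for a modality: any mutable object that carries names.
modality = pd.DataFrame({"x": [1.0, 2.0]}, index=["0", "1"])

# Stand-in for the container: it stores a reference, not a copy.
container = {"mod1": modality}

# Rename in place through the container...
container["mod1"].index = ["mod1:" + name for name in container["mod1"].index]

# ...and the original variable sees the new names as well.
print(list(modality.index))
```

If the original names are still needed, copying the object before placing it in the container avoids this side effect.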


## Update

*NB: If individual modalities are changed, updating the MuData object containing them might be required.*

Modalities in MuData objects are full-featured AnnData objects. Hence they can be operated on individually, after which their MuData parent has to be updated to fetch the changes.

### Observations

Consider the following example, in which a new column is added to a modality-specific metadata table:

[7]:

mod1.obs["mod1_profiled"] = True


While mdata includes mod1 as its first modality, it currently does not know about this change:

[8]:

mdata.obs.columns

[8]:

Index([], dtype='object')


The .update() method fetches these updates and propagates them to the global .obs table.

[9]:

mdata.update()
print(mdata.obs.columns)

Index(['mod1:mod1_profiled'], dtype='object')

[9]:

mod1:mod1_profiled
0 True
1 True

As MuData objects are designed with shared observations by default, this annotation is automatically prefixed with the name of the modality it originated from.
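The reason for the prefix can be sketched with pandas alone. This is an illustration of the idea under stated assumptions, not mudata's actual implementation: the tables `mod1_obs` and `mod2_obs` and the `cell_*` names are hypothetical. Because both tables describe the same observations, a column with the same name in each would collide unless it is namespaced by modality.

```python
import pandas as pd

obs_names = ["cell_0", "cell_1", "cell_2"]

# Hypothetical per-modality metadata tables over the same observations.
mod1_obs = pd.DataFrame({"profiled": True}, index=obs_names)
mod2_obs = pd.DataFrame({"profiled": [True, False, True]}, index=obs_names)

# Prefix each table's columns with its modality name, then join on the
# shared observation index so that both columns can coexist.
global_obs = mod1_obs.add_prefix("mod1:").join(mod2_obs.add_prefix("mod2:"))

print(list(global_obs.columns))
```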

### Variables

Variables, on the other hand, are assumed by default to be unique to their modalities. This allows us to merge annotations across modalities, when possible.

[10]:

mod1.var["assay"] = "A"
mod2.var["assay"] = "B"

# Will fetch these values
mdata.update()

[11]:

np.random.seed(10)
mdata.var.sample(5)

[11]:

assay
mod1:24 A
mod1:65 A
mod2:13 B
mod2:161 B
mod2:88 B
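Why a single unprefixed column is possible here can again be sketched with pandas. This is an assumption-laden illustration rather than mudata's implementation, and `mod1_var` / `mod2_var` are hypothetical tables: when the variable names of the modalities do not overlap, their annotation tables can simply be stacked, and a column with the same name fills in without conflicts.

```python
import pandas as pd

# Hypothetical per-modality variable annotations with disjoint names.
mod1_var = pd.DataFrame({"assay": "A"}, index=["mod1:0", "mod1:1"])
mod2_var = pd.DataFrame({"assay": "B"}, index=["mod2:0", "mod2:1", "mod2:2"])

# Disjoint indices mean the tables can be stacked row-wise into one column.
global_var = pd.concat([mod1_var, mod2_var])

print(global_var["assay"].value_counts().to_dict())
```

With overlapping variable names the stacked index would no longer be unique, which is one reason unique names across modalities are recommended.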

See the tutorials for how, for example, muon operates on MuData objects and provides access to modality-specific slots beyond just metadata.