Why Isn’t There a Standard for Typing Array Dimensions in Python?

0
11
Asked By DataWizard99 On

I'm curious about why there's no widely accepted standard for typing array dimensions in Python. As someone working in data science, it would be really helpful to clarify whether a data structure is a vector, matrix, or even a tensor with more dimensions. It's important for functions to indicate if they return outputs of the same size or not. Although I get that typing isn't enforced in Python, I think making function signatures clearer could improve readability significantly. I know libraries like NumPy and SciPy use docstrings for this purpose, but wouldn't it make sense to specify array dimensions directly in function signatures?

5 Answers

Answered By TypeGuru42 On

PEP 646 introduced a feature called `TypeVarTuple`, aimed at addressing this issue. However, I'm not sure how much it's being utilized in libraries or supported by type checkers yet.

Answered By LearningML_89 On

I totally relate to this frustration! When I started with machine learning, I would receive functions that accepted multiple arrays, but they rarely specified their expected shapes. The docstring might say ‘a vector of dimension d,’ but the reality is that sometimes, vectors are treated as 2D in ML. For me, clear type hints for array dimensions would be a game changer, especially since functions often deal with arrays that must meet strict shape dependencies.

Answered By ShapeSeeker On

NumPy does support specifying shape types, though it’s a bit cumbersome. For example, to define a 100x100x3 float64 array, you'd write `np.ndarray[tuple[Literal[100], Literal[100], Literal[3]], np.dtype[np.float64]]`. It hasn't really caught on because it's clunky, and the benefits aren’t substantial enough for widespread adoption.

Answered By TechyTracy On

There's actually jaxtyping, which works with both NumPy and Torch. I've found it really useful for my machine learning projects. It helps clarify dimensions, especially when you’re working with JAX and using functions like vmap.

Answered By CodeSmith33 On

The dimensionality really depends on the value; you can create your own annotations to enforce this. For example, check out how `astropy.units.quantity_input` handles it. Although, this may vary based on how expressive your type system is.

Related Questions

LEAVE A REPLY

Please enter your comment!
Please enter your name here

This site uses Akismet to reduce spam. Learn how your comment data is processed.