quick numpy question

I have a numpy array of nested tuples, and I'd like to extract only the 0th element from each tuple.

Example:


# array =  [ (0,(...)) (1,(...)) (2,(...)) (3,(...)) ]
extract = [e[0] for e in array]
 

My question: is there a more efficient way than creating another list with a list comprehension? Can numpy just view the array without looping?

No. This is because at some point or another, any function/method you call will have to look at each element.

So since you’re talking about efficiency, you’re going to get at best O(n) (linear), because you have to read at least n elements.

EDIT: Aha, just thought of print(array[0:n][0]) lol. But this still ‘loops’ and the efficiency will stay the same.

EDIT2: My bad. That doesn’t work.

Why nested tuples?
Why not a 2-dimensional array and numpy's special slice syntax?

Maybe rethink how the data is stored?

If you store the data in the other direction (transposed), then you can just grab a row, which is fast.
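
Something like this, as a rough sketch (assuming the data can be made rectangular; the values are made up):


import numpy

# hypothetical rectangular layout: one (key, x, y, z) row per inner tuple
array2d = numpy.array([[0, 21.0, 2.0, 3.0],
                       [0, 21.0, 3.0, 5.0],
                       [1,  4.0, 5.0, 6.0]])

keys = array2d[:, 0]   # first column, returned as a view, no Python-level loop
# or store it transposed, then the keys are simply row 0:
keys = array2d.T[0]
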

Thanks for the 2D array suggestion, and yes, I previously tried to make an ndarray with asarray, but it failed due to my nested data's dynamic size (and my stupidity), so I resorted to a 1D array :o

the data structure in question is actually a bunch of tuples nested inside a growing list which is stored inside a dict


# inner tuples always contain 3 floats; however, the length of the list containing them is unpredictable, as new data keeps getting appended
dic_example = { 0 : [(21,2,3),(21,3,5),(22,4,6)],
                1 : [(4,5,6)],
                2 : [(...),(...)],
                3 : [(...),(...),(...),(...),(...),(...),(...),(...),(...),(...)]
              }

# 1d array attempt
import numpy

name = ('index','data')
format = ('i4','f2,f2,f2')
dtyp = dict(names=name, formats=format)
arr = [(key,v) for key,val in dic_example.items() for v in val]
array = numpy.array(arr, dtype=dtyp)

Although this is very common usage in Python, I couldn't find examples of it for numpy, so I'd appreciate it if anyone could show me an example of how to turn that into an ndarray.

Your request already implies access to all items of the container, as you want a part of every item ("extract […] the 0th element from all the tuples"). There is no way to get such a list without touching all items.

Shadow list

But there are ways to spend the processing time at a different point. E.g. every time you add an item to the list, you also add its 0th element to a 0th-element list. This way the list is already present when you need it.

Assumption: the 0th element never changes without the 0th-element list being updated.

The "trick" is that the processing time is spent when adding the items, so you do not need to spend it when you request the list.
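
A minimal sketch of that idea (the names are made up):


# the full (key, (x, y, z)) records, plus a "shadow list" of their 0th elements
data = []
keys = []

def add_record(record):
    data.append(record)
    keys.append(record[0])   # pay the cost here, once per insert

add_record((0, (21.0, 2.0, 3.0)))
add_record((1, (4.0, 5.0, 6.0)))
print(keys)   # [0, 1]  already available, no extra pass over data
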

Other solutions

To provide other solutions we need to know what you want to do with that “extract”.

How do you want your dictionary to behave like an n-dimensional array? Do you want to take each of the values of the tuples in the lists and store them as [(dict[0][0][0], dict[0][0][1]), (dict[0][0][2], dict[0][1][0]), …] for a 2D array, for example? If you want to generate an ndarray by progressively taking the nth value of each list (not tuple) in the dictionary, you're going to run into problems, because some lists may not have as many tuples as others.

Oh, it's a 3D list! :smiley:
Only it ain't fixed size…

Numpy likes fixed size.
So is there an upper bound to the counts?
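
If there is one, padding up to it might be enough to get a fixed-size array (just a sketch, filling the gaps with NaN):


import numpy

dic_example = {0: [(21, 2, 3), (21, 3, 5), (22, 4, 6)],
               1: [(4, 5, 6)]}
upper = max(len(v) for v in dic_example.values())   # or a known upper bound

padded = numpy.full((len(dic_example), upper, 3), numpy.nan)
for i, (key, rows) in enumerate(sorted(dic_example.items())):
    padded[i, :len(rows)] = rows

print(padded.shape)   # (2, 3, 3), a fixed-size 3D array numpy is happy with
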

The extracted list is for binary-searching the array. Previously I was using a sorted list of the dictionary keys (ints) and bisect to locate the nearest key in the dictionary. So far both methods work, but I just thought there might be a quick numpy method around for that purpose.
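
Edit: numpy.searchsorted looks like it might be exactly that; a rough sketch of the nearest-key lookup (not benchmarked):


import numpy

keys = numpy.array(sorted(dic_example.keys()))   # sorted int keys, as before
target = 2.4
pos = numpy.searchsorted(keys, target)           # binary search, like bisect.bisect_left

# pick whichever neighbour is closer, clamping at the ends
lo = max(pos - 1, 0)
hi = min(pos, len(keys) - 1)
nearest = keys[lo] if abs(keys[lo] - target) <= abs(keys[hi] - target) else keys[hi]
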

@Mirror|rorriM
yeah, I’m not even sure how I can restructure that dict/list/tuple combo…

@VegetableJuiceF
the bounds are unpredictable even to their creator, although we are talking about a very, very clueless one right here lol

Hey, I just found a simple solution to my problem and successfully created a 2D array by using dtype=object; now I can grab all the 0th keys without a Python loop:


arr = [(key,v) for key,val in dic_example.items() for v in val]
array = numpy.array(arr, dtype=object)
print(array[:, 0])   # column 0 of the object array: all the keys

I feel so dumb right now lol, why didn't I try that before posting here? :confused: Then again, I probably would have gone on using the wrong data structure without discussing it here, so thanks for everyone's input; it's really helpful.

You know that this is still a complete loop, don’t you?

As long as you do this rarely (best just once), you should be fine.

If you want the closest x elements, need to do it often, and the elements are static, you should use a mathutils.kdtree.
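
Rough sketch of how mathutils.kdtree is typically used (assuming 3D points that don't change after the tree is built):


from mathutils.kdtree import KDTree

points = [(21.0, 2.0, 3.0), (21.0, 3.0, 5.0), (22.0, 4.0, 6.0)]

kd = KDTree(len(points))          # build and balance once
for i, co in enumerate(points):
    kd.insert(co, i)
kd.balance()

co, index, dist = kd.find((21.0, 3.0, 4.0))   # single nearest point
results = kd.find_n((21.0, 3.0, 4.0), 2)      # two nearest points
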

Doesn’t happen often but Blue might be onto something.

Complexity reminder:
O(log n) < O(n) < O(n log n)

If you are creating the array every time, then you are better off just for-looping over it, O(n), to find the closest element.
Even sorting is O(n log n) > O(n) (more expensive).

If you are reusing it over several logic ticks, then the initial investment is larger for a tree search structure, O(n log n).
But afterwards every lookup only takes O(log n).

I just found another solution, just to make myself look even dumber and more clueless with numpy: instead of using dtype=object, I could have used my original 1D array and accessed its field name like a dictionary:


array['index']

It still gets the first column even though the array is only 1-D (ndim == 1)… I have no idea what I'm doing :no:
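
Here's roughly the whole thing end to end, in case it helps anyone; I swapped the nested 'f2,f2,f2' format for a ('data', 'f4', (3,)) subarray field, which I assume is equivalent, just to keep the three floats together:


import numpy

dic_example = {0: [(21.0, 2.0, 3.0), (21.0, 3.0, 5.0), (22.0, 4.0, 6.0)],
               1: [(4.0, 5.0, 6.0)]}

# structured dtype: one int key plus a 3-float payload per row
dtyp = numpy.dtype([('index', 'i4'), ('data', 'f4', (3,))])
arr = [(key, v) for key, val in dic_example.items() for v in val]
array = numpy.array(arr, dtype=dtyp)

print(array['index'])   # [0 0 0 1], every key, no list comprehension
print(array['data'])    # the (4, 3) block of floats
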


@Monster
Yes, I know it loops; I just wanted to avoid another list comprehension loop in Python.

@BluePrintRandom
TEDAAAAAAA!!! (translate: IT APPEARS!!)
Yup, can't go wrong with a tree data structure, after some more experimenting with numpy.

@VegetableJuiceF
Thanks for the reminder, this big-O stuff is always messing with my head.

Glad you were able to find what you were looking for! Thanks for sharing. And no, you're not dumb, you're just experimenting; don't beat yourself up over it. Good job!
