### Which one is better and when

I was coding part (a) of Exercise 1 on Exercise sheet 1.

There is a simple pythonic way to calculate F(x):

```
import numpy as np
import pylab

# numbers: the sample of values from the exercise
x = np.log(np.sort(numbers))
y = np.log(np.arange(len(numbers), 0, -1) / len(numbers))
pylab.plot(x, y)
pylab.show()
```

However, this should not work when some of the x values are equal. (Right?) Therefore I wrote the following, less pythonic, solution:

```
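import numpy as np
import pylab

x = []
y = []
numbers.sort()
n = len(numbers)
for i in range(n):  # start at 0 so the smallest value is not skipped
    # keep only the first occurrence of each distinct value
    if i == 0 or numbers[i] != numbers[i - 1]:
        x.append(numbers[i])
        y.append(float(n - i) / n)
x = np.log(x)
y = np.log(y)
pylab.plot(x, y)
pylab.show()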
```

Which one is better? Do we have to worry about equal values in general, or is it task-dependent?

Community: CL IB Foundations of Data Science

### 1 Answer

You're right, the second one handles equal values while the first one doesn't. However, there are a number of ways to 'pythonify' your second implementation:

```
empDistX = []
empDistY = []
numbers = sorted(data)  # sorted() is more pythonic, and is valid over all iterables
length = len(numbers)
for i, x in enumerate(numbers):  # get the element and index in one go
    if x not in empDistX:  # a linear scan of the list, so this incurs a performance hit
        # but is more in the Python style
        empDistX.append(x)
        empDistY.append(float(length - i) / length)
```

I believe we do have to worry about equal values - in general and in this specific case.
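If the repeated `not in` scan becomes a concern, the same idea can probably be written without an explicit loop. The following is only a sketch under my own assumptions: the helper name `empirical_tail` is made up, and it relies on `np.unique` with `return_counts=True` to collapse equal values. For each distinct value it returns the fraction of samples that are greater than or equal to it, which is what the loop above computes.

```python
import numpy as np

def empirical_tail(numbers):
    """Return the distinct sorted values and, for each, the fraction of samples >= it."""
    values = np.asarray(numbers, dtype=float)
    # np.unique gives the sorted distinct values and how often each one occurs
    xs, counts = np.unique(values, return_counts=True)
    # number of samples strictly smaller than each distinct value
    below = np.cumsum(counts) - counts
    ys = (values.size - below) / values.size
    return xs, ys

# hypothetical sample with a repeated value
xs, ys = empirical_tail([3, 1, 2, 1])
```

The log-log plot would then be `pylab.plot(np.log(xs), np.log(ys))`, as in the question.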

Alright, that's part 1a done... part 1b!

x will be the log of the sorted numbers, a list that looks like

[small numbers, .., larger ones, .., large numbers]

However, x might contain duplicates, e.g.

[smallV1, smallV1, smallV1, .., large numbers]

where smallV1 is a value that occurs several times.

The y list is always the same, whatever the values are (shown here before taking the log):

[1.0, (len(numbers)-1)/len(numbers), (len(numbers)-2)/len(numbers), .. and so on]

So for equal x values we get different F(x) (i.e. y) values, which contradicts the definition of a function if we assume F(x) is a proper function.
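A tiny check makes this concrete. The sample below is made up for illustration, and the log is left out since it does not change which x values coincide:

```python
import numpy as np

numbers = [1.0, 1.0, 2.0, 3.0]  # hypothetical sample with a repeated value
x = np.sort(numbers)
y = np.arange(len(numbers), 0, -1) / len(numbers)

# Group the y values by their x value to see which x gets more than one y
ys_per_x = {}
for xi, yi in zip(x.tolist(), y.tolist()):
    ys_per_x.setdefault(xi, []).append(yi)

print(ys_per_x)
```

Here the repeated value 1.0 ends up with two different y values, while 2.0 and 3.0 each get one.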

P.S. Sorry for the layout of the explanation, I'm in a hurry.