NIFTy issueshttps://gitlab.mpcdf.mpg.de/ift/nifty/-/issues2016-05-26T11:09:02Zhttps://gitlab.mpcdf.mpg.de/ift/nifty/-/issues/33Add tensor-/outer-dot to d2o2016-05-26T11:09:02ZTheo SteiningerAdd tensor-/outer-dot to d2oTheo SteiningerTheo Steiningerhttps://gitlab.mpcdf.mpg.de/ift/nifty/-/issues/28Move obj.bincount and obj.unique into distributor and make them more efficient.2016-05-26T11:09:09ZTheo SteiningerMove obj.bincount and obj.unique into distributor and make them more efficient.For efficiency, use Allreduce instead of allgather in bincount.
Use `fast-summation` in obj.unique <http://materials.jeremybejarano.com/MPIwithPython/collectiveCom.html#fastsum>For efficiency, use Allreduce instead of allgather in bincount.
Use `fast-summation` in obj.unique <http://materials.jeremybejarano.com/MPIwithPython/collectiveCom.html#fastsum>Theo SteiningerTheo Steiningerhttps://gitlab.mpcdf.mpg.de/ift/nifty/-/issues/26Add unit tests for `copy=True/False` functionality2016-05-26T11:09:16ZTheo SteiningerAdd unit tests for `copy=True/False` functionalityTheo SteiningerTheo Steiningerhttps://gitlab.mpcdf.mpg.de/ift/nifty/-/issues/25Slow speed of obj.copy()2016-04-06T00:22:22ZTheo SteiningerSlow speed of obj.copy()obj.copy() is slower than obj+0obj.copy() is slower than obj+0Theo SteiningerTheo Steiningerhttps://gitlab.mpcdf.mpg.de/ift/nifty/-/issues/24Slow speed of d2o slicing distribibutor during slicing access.2016-04-06T00:56:28ZTheo SteiningerSlow speed of d2o slicing distribibutor during slicing access.obj[::-1]obj[::-1]Theo SteiningerTheo Steiningerhttps://gitlab.mpcdf.mpg.de/ift/nifty/-/issues/23d2o initialization from d2o is slow2016-04-06T00:54:57ZTheo Steiningerd2o initialization from d2o is slowThe slicig distributor is slower than it could be for:
a = np.arange(200000)
obj = distributed_data_object(a, distribution_strategy='fftw')
distributed_data_object(obj, distribution_strategy='equal')
This is connected to the _enfold and _defold methods.
The slicig distributor is slower than it could be for:
a = np.arange(200000)
obj = distributed_data_object(a, distribution_strategy='fftw')
distributed_data_object(obj, distribution_strategy='equal')
This is connected to the _enfold and _defold methods.
Theo SteiningerTheo Steiningerhttps://gitlab.mpcdf.mpg.de/ift/nifty/-/issues/19Add function d2o.arange2016-05-26T11:09:22ZTheo SteiningerAdd function d2o.arangeTheo SteiningerTheo Steiningerhttps://gitlab.mpcdf.mpg.de/ift/nifty/-/issues/15d2o cumsum and flatten rely on certain features of distribution strategy2016-05-26T11:09:31ZTheo Steiningerd2o cumsum and flatten rely on certain features of distribution strategycumsum and flatten assume: if the shape of the d2o changes through flattening, the distribution strategy was "slicing". cumsum and flatten assume: if the shape of the d2o changes through flattening, the distribution strategy was "slicing". Theo SteiningerTheo Steiningerhttps://gitlab.mpcdf.mpg.de/ift/nifty/-/issues/14The d2o_librarian will fail when mixing different MPI comms2016-05-26T11:09:36ZTheo SteiningerThe d2o_librarian will fail when mixing different MPI commsEvery local librarian instance on a node of a MPI cluster just increments its internal counter by one when a new d2o is registered. This gets out of sync, when only a part of the full cluster is covered by a special comm.
?Possible solution: The individual librarians store the id of 'their' d2o and communicate a common id for their dictionary.
Con: Involves MPI communication.Every local librarian instance on a node of a MPI cluster just increments its internal counter by one when a new d2o is registered. This gets out of sync, when only a part of the full cluster is covered by a special comm.
?Possible solution: The individual librarians store the id of 'their' d2o and communicate a common id for their dictionary.
Con: Involves MPI communication.Theo SteiningerTheo Steiningerhttps://gitlab.mpcdf.mpg.de/ift/nifty/-/issues/13Add support for `from array` indexing2016-05-26T11:09:42ZTheo SteiningerAdd support for `from array` indexingWhen building the kdict from pindex and kindex something of the following form must be done (a==kindex, b==pindex):
a = np.arange(16)*2
b = np.array([[3,2],[1,0]])
In [1]: a[b]
Out[1]:
array([[6, 4],
[2, 0]])
Currently, this is solved using a hack:
p.apply_scalar_function(lambda z: obj[z])
This functionality could easily be added to the get_data interface.
When building the kdict from pindex and kindex something of the following form must be done (a==kindex, b==pindex):
a = np.arange(16)*2
b = np.array([[3,2],[1,0]])
In [1]: a[b]
Out[1]:
array([[6, 4],
[2, 0]])
Currently, this is solved using a hack:
p.apply_scalar_function(lambda z: obj[z])
This functionality could easily be added to the get_data interface.
Theo SteiningerTheo Steiningerhttps://gitlab.mpcdf.mpg.de/ift/nifty/-/issues/10Semi-advanced indexing is not recognized2016-05-26T11:09:50ZTheo SteiningerSemi-advanced indexing is not recognized a = np.arange(24).reshape((3, 4,2))
obj = distributed_data_object(a)
Semi-advanced indexing
a[(2,1,1),1]
yields
array([[18, 19],
[10, 11],
[10, 11]])
The ``indexinglist'' scheme in d2o expects either scalars or numpy arrays as tuple elements and therefore:
obj[(2,1,1),1] -> AttributeError
However,
obj[np.array((2,1,1)), 1]
works.
Solution: Parse the elements and in doubt cast them to numpy arrays.
a = np.arange(24).reshape((3, 4,2))
obj = distributed_data_object(a)
Semi-advanced indexing
a[(2,1,1),1]
yields
array([[18, 19],
[10, 11],
[10, 11]])
The ``indexinglist'' scheme in d2o expects either scalars or numpy arrays as tuple elements and therefore:
obj[(2,1,1),1] -> AttributeError
However,
obj[np.array((2,1,1)), 1]
works.
Solution: Parse the elements and in doubt cast them to numpy arrays.
Theo SteiningerTheo Steiningerhttps://gitlab.mpcdf.mpg.de/ift/nifty/-/issues/8Profile the d2o.bincount method2016-05-26T11:09:57ZTheo SteiningerProfile the d2o.bincount methodThe d2o.bincount method scales well with MPI parallelization but compared to single-core np.bincount has a rather big overhead. The d2o.bincount method scales well with MPI parallelization but compared to single-core np.bincount has a rather big overhead. Theo SteiningerTheo Steiningerhttps://gitlab.mpcdf.mpg.de/ift/nifty/-/issues/7Move `flatten` method into the distributor.2016-05-26T11:10:06ZTheo SteiningerMove `flatten` method into the distributor.At the moment `flatten` is performed by the distributed_data_object itself. Thereby it assumes, that flattening the local arrays produces the right result. In general with arbitrary distribution strategies this is wrong. At the moment `flatten` is performed by the distributed_data_object itself. Thereby it assumes, that flattening the local arrays produces the right result. In general with arbitrary distribution strategies this is wrong. Theo SteiningerTheo Steiningerhttps://gitlab.mpcdf.mpg.de/ift/nifty/-/issues/6Add `source_rank` parameter to d2o.set_full_data()2016-05-26T11:10:11ZTheo SteiningerAdd `source_rank` parameter to d2o.set_full_data()A source_rank parameter should be added to the distributed_data_object in order to specify on which node the source data-array resides on. A source_rank parameter should be added to the distributed_data_object in order to specify on which node the source data-array resides on. Theo SteiningerTheo Steiningerhttps://gitlab.mpcdf.mpg.de/ift/nifty/-/issues/5Add `copy` parameter to d2o.get_data()2016-04-06T00:56:53ZTheo SteiningerAdd `copy` parameter to d2o.get_data()A `copy` parameter should be added to d2o.get_data in order to control, whether the resulting d2o should contain a view on or a copy of the old data. A `copy` parameter should be added to d2o.get_data in order to control, whether the resulting d2o should contain a view on or a copy of the old data. Theo SteiningerTheo Steiningerhttps://gitlab.mpcdf.mpg.de/ift/nifty/-/issues/4Add `axis` keyword functionality to unary methods.2016-05-26T11:10:16ZTheo SteiningerAdd `axis` keyword functionality to unary methods.Many numpy functions support the `axis` keyword in order to perform an operation only along certain directions of the array. The current implementation of d2o does not support this, e.g. for `all`, `any`, `sum`, etc...
Related to: theos/NIFTy#3Many numpy functions support the `axis` keyword in order to perform an operation only along certain directions of the array. The current implementation of d2o does not support this, e.g. for `all`, `any`, `sum`, etc...
Related to: theos/NIFTy#3https://gitlab.mpcdf.mpg.de/ift/nifty/-/issues/3d2o: _contraction_helper does not work when using numpy keyword arguments2017-09-20T21:44:34ZTheo Steiningerd2o: _contraction_helper does not work when using numpy keyword argumentsThe `_contraction_helper` passes keyword arguments to the underlying numpy functions (axis=, keepdims=). The result of the _contraction_helper's local computation is then an array and not a scalar. Therefore the dtype check fails.
Fix: After solving theos/NIFTy#2, adopt to the case that the local run's result object is an array and make a further distinction of cases, i.e for something like axis=0 for the slicing_distributor. The `_contraction_helper` passes keyword arguments to the underlying numpy functions (axis=, keepdims=). The result of the _contraction_helper's local computation is then an array and not a scalar. Therefore the dtype check fails.
Fix: After solving theos/NIFTy#2, adopt to the case that the local run's result object is an array and make a further distinction of cases, i.e for something like axis=0 for the slicing_distributor. Theo SteiningerTheo Steiningerhttps://gitlab.mpcdf.mpg.de/ift/nifty/-/issues/2d2o: Contraction functions rely on non-degeneracy of distribution strategy2017-09-20T21:36:14ZTheo Steiningerd2o: Contraction functions rely on non-degeneracy of distribution strategySeveral methods of the distributed_data_object rely on the fact, that the distribution strategy behaves as if the local data was non-degenerate. Currently the non-distributor fixes this by returning trivial (local) results in the _allgather and the _Allreduce_sum method.
Affected d2o methods are at least: _contraction_helper, mean
Fix: Move the functionality for sum, prod, etc... into the distributor.Several methods of the distributed_data_object rely on the fact, that the distribution strategy behaves as if the local data was non-degenerate. Currently the non-distributor fixes this by returning trivial (local) results in the _allgather and the _Allreduce_sum method.
Affected d2o methods are at least: _contraction_helper, mean
Fix: Move the functionality for sum, prod, etc... into the distributor.Theo SteiningerTheo Steiningerhttps://gitlab.mpcdf.mpg.de/ift/nifty/-/issues/1Add out-array parameter to numerical d2o operations.2016-05-26T11:10:36ZTheo SteiningerAdd out-array parameter to numerical d2o operations.Numpy supports to specify an out array in order to avoid memory reallocation.
a = np.array([1,2,3,4])
b = np.array([5,6,7,8])
# slow:
a = a + b
# fast:
np.add(a,b,out=a)
Numpy supports to specify an out array in order to avoid memory reallocation.
a = np.array([1,2,3,4])
b = np.array([5,6,7,8])
# slow:
a = a + b
# fast:
np.add(a,b,out=a)
Theo SteiningerTheo Steininger