.. last commit introduced a compiler error on missing global variables.
the intention here is to take future's algorithm, embed into a class
and add wrappers for user charts / datafilters.
.. kmeans(centers|assignments, k, dim1, dim2 .. dimn)
perform k-means clustering on data with multiple dimensions
and return either the centers or the assignments.
the return values are ordered so they can be displayed
easily in an overview table e.g.
values {
kmeans(centers, 3, metrics(TSS), metrics(IF));
}
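A minimal sketch of the centers/assignments idea in Python, for illustration only. It uses plain Lloyd iteration, not the Hamerly variant, and it is not the GoldenCheetah implementation. Seeding with the first k points and returning the centers as one column per dimension are assumptions made here so the result is deterministic and table-friendly.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(want, k, *dims, max_iter=100):
    """Plain Lloyd's k-means; want is "centers" or "assignments"."""
    points = list(zip(*dims))  # one tuple per observation
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")

    # assumption: seed with the first k points (deterministic, not k-means++)
    centers = [list(p) for p in points[:k]]
    assignments = None

    for _ in range(max_iter):
        # assignment step: index of the nearest center for every point
        nearest = [
            min(range(k), key=lambda c: sq_dist(p, centers[c]))
            for p in points
        ]
        if nearest == assignments:
            break  # nothing moved, so the centers are stable
        assignments = nearest

        # update step: each center becomes the mean of its members
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]

    if want == "assignments":
        return assignments
    if want == "centers":
        # assumption: one column per dimension, one row per cluster
        return [[center[d] for center in centers] for d in range(len(dims))]
    raise ValueError("want must be 'centers' or 'assignments'")


x = [1, 2, 3, 10, 11, 12]
y = [1, 2, 3, 10, 11, 12]
print(kmeans("assignments", 2, x, y))
print(kmeans("centers", 2, x, y))
```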
.. will look at how we might plot these in charts with either
color coding of points or perhaps voronoi diagrams.
.. with grateful thanks to Greg Hamerly
A fast kmeans algorithm described here:
https://epubs.siam.org/doi/10.1137/1.9781611972801.12
The source repository is also here:
https://github.com/ghamerly/fast-kmeans
NOTE:
The original source has been included largely as-is with
a view to writing a wrapper around it using Qt semantics
for use in GoldenCheetah (e.g. via datafilter)
The original source included multiple kmeans algorithms;
we have only kept the `fast' Hamerly variant.
Only the first 10 examples are reported to avoid flooding the log with anomalies.
These anomalies can easily be fixed using the Fix Speed from Distance tool
with the moving average window set to 1.
.. renamed pdf/cdf to pdfnormal and cdfnormal as they returned
a pdf for a Gaussian.
.. added pdfbeta(a,b,x) and cdfbeta(a,b,x) for working with
beta distributions.
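For reference, the underlying formulas can be sketched in Python as below. Only the pdfbeta(a,b,x) and cdfbeta(a,b,x) argument order comes from the notes above. The mean/sd parameters on the normal functions, the zero return at the interval endpoints, and the midpoint-rule integration for the beta cdf are assumptions for illustration, not the data filter's implementation.

```python
import math


def pdfnormal(x, mean=0.0, sd=1.0):
    """Density of a Gaussian at x (parameter order is an assumption)."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def cdfnormal(x, mean=0.0, sd=1.0):
    """Cumulative Gaussian probability up to x, via the error function."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def pdfbeta(a, b, x):
    """Density of Beta(a, b) at x; endpoints and outside are treated as 0 here."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    # B(a, b) = gamma(a) * gamma(b) / gamma(a + b)
    beta_ab = math.gamma(a) * math.gamma(b) / math.gamma(a + b)
    return x ** (a - 1) * (1.0 - x) ** (b - 1) / beta_ab


def cdfbeta(a, b, x, steps=1000):
    """Beta cdf by midpoint-rule integration (crude; slow to converge if a or b < 1)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    width = x / steps
    return sum(pdfbeta(a, b, (i + 0.5) * width) for i in range(steps)) * width


print(pdfnormal(0.0), cdfnormal(0.0))
print(pdfbeta(2, 2, 0.5), cdfbeta(2, 2, 0.5))
```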
.. save to .gchart when a user chart is on an overview.
rather annoyingly the scaling is preserved which should
ideally be defaulted on import depending upon context.
we should fix that.
.. Create a new tile on an overview by importing the XML .gchart.
The importer checks the chart is a user chart and also that
it was created for the current view (Analysis vs Trends).
.. Data table and Interval Bubble generate and respond to
interval signals like hover and select.
.. a compromise to help users navigate the data when it
is not possible to clickthru for intervals
.. the data table now accepts a new function i {} which
returns the names of the intervals for each row in
a similar way to f {} for activities.
.. time_to_string is for formatting durations, so it will
use as few characters as possible (e.g. 10s, 1:00).
since the interval time is a time of day we want the
full hh:mm:ss format.
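The difference between the two formats can be sketched in Python as follows. Both function names are hypothetical, and the exact compaction rules (seconds only below a minute, m:ss below an hour) are an assumption inferred from the 10s and 1:00 examples above.

```python
def duration_to_string(seconds):
    """Compact duration format: as few characters as possible."""
    hours, rem = divmod(int(seconds), 3600)
    minutes, secs = divmod(rem, 60)
    if hours:
        return f"{hours}:{minutes:02d}:{secs:02d}"
    if minutes:
        return f"{minutes}:{secs:02d}"
    return f"{secs}s"


def time_of_day_to_string(seconds):
    """Time of day: always the full, zero-padded hh:mm:ss."""
    hours, rem = divmod(int(seconds) % 86400, 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


print(duration_to_string(10), duration_to_string(60))
print(time_of_day_to_string(60))
```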
.. all matches were being returned, which was not the documented
behaviour, nor generally the desired result
i.e.
match(c(1,2,3), c(1,2,3,1,2,3,1,2,3));
would return
[ 0, 3, 6, 1, 4, 7, 2, 5, 8 ]
but should have returned
[ 0, 1, 2 ]
.. there are likely two things users would like to be able to
control that could be added in the future:
- match all occurrences (this commit stops that now)
- return NA or -1 for items that are not found
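The first-match behaviour described above can be sketched in Python like this. It is an illustration rather than the data filter code, and silently skipping items that are not found is an assumption based on the note about possible future options.

```python
def match(needles, haystack):
    """Return the index of the first occurrence in haystack of each needle."""
    first_index = {}
    for i, value in enumerate(haystack):
        # setdefault keeps the earliest index and ignores later repeats
        first_index.setdefault(value, i)
    # assumption: needles that never occur are dropped from the result
    return [first_index[v] for v in needles if v in first_index]


print(match([1, 2, 3], [1, 2, 3, 1, 2, 3, 1, 2, 3]))
```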
.. hue goes red-yellow-green-cyan-blue-magenta-purple-red
we only really want the first half of that range for our
heatmap, which effectively makes it red-amber-green with
cyan for very low numbers.
As a palette it will make a lot more sense to the majority
of users.
We may look to add multiple schemes, for example limit to
a single color range or brown/blue etc etc.
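A rough Python sketch of restricting the palette to the first half of the hue wheel. Full saturation and value, and clamping the input to 0-1, are assumptions; the function name simply mirrors the idea and is not the Utils code.

```python
import colorsys


def heatcolor(heat):
    """Map a heat value in 0..1 to an (r, g, b) tuple with components in 0..1."""
    heat = min(1.0, max(0.0, heat))  # assumption: clamp out-of-range input
    # only use hue 0.0 (red) to 0.5 (cyan); a heat of 1 is red, 0 is cyan
    hue = (1.0 - heat) * 0.5
    return colorsys.hsv_to_rgb(hue, 1.0, 1.0)


print(heatcolor(1.0))
print(heatcolor(0.0))
```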
.. horrible nested scrolling- when in a data table and there
are multiple rows any wheel event will scroll whilst the
mouse cursor is over the table.
.. we do check that the mouse moved too, so if just scrolling
with the mouse wheel it won't trigger until the mouse
is moved (but most folks aren't that steady on the mouse!).
.. if you assign to a vector using indexes it was only setting
with a single value. But it should be possible to assign
a vector and have it repeat
e.g.
a <- c(1,2,3,4,5,6);
indexes <- c(3,4,5);
a[indexes] <- c(9,10);
# a now contains [ 1, 2, 9, 10, 9, 6 ]
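The repeat-on-assign behaviour can be sketched in Python as below. The function name is hypothetical, and the positions are written as Python's zero-based 2, 3, 4 so that they land on the same elements as the example above.

```python
def assign_recycled(values, indexes, replacement):
    """Assign replacement into values at indexes, repeating it as needed."""
    if not replacement:
        raise ValueError("replacement must not be empty")
    result = list(values)  # leave the caller's list untouched
    for n, i in enumerate(indexes):
        # wrap around the replacement vector when it is shorter than indexes
        result[i] = replacement[n % len(replacement)]
    return result


print(assign_recycled([1, 2, 3, 4, 5, 6], [2, 3, 4], [9, 10]))
```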
.. also as part of the data table click thru, the highlight
that a row can be clicked to navigate to the ride
should only be shown if that row has a file name.
.. both fixups are related to listing PMC data in an overview
data table and allowing click through for the rows that
have a ride associated, the code looks like this:
f {
# find dates that contain rides
ridedates <- metrics(date);
pmcdates <- pmc(BikeStress,date);
index <- match(ridedates, pmcdates);
# returning all blanks for filenames
# except where there is a ride on that date
returning <- rep("", length(pmcdates));
returning[index] <- filename();
returning;
}
.. tweaking the names from the last couple of commits
* to return heatmap values (between 0 and 1) the
Data Table function "h" is now called "heat".
* the data filter function that does the unity based
normalization is renamed from "heat" to "normalize".
.. did this since normalize() is more accurate and
will be more appropriate when adapting data to
use other algorithms in the future.
.. activities legacy program reinstated and also sets the h {}
function for the activity list.
.. the DataOverviewItem::setDateRange() method now calls h {}
if it is present (forgot in last commit)
.. added a heat(min,max,value) data filter function to convert
values to a heat value between 0 and 1
e.g. heat(0,config(pmax),Average_Power)
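Unity-based normalization is the usual (value - min) / (max - min) rescaling. A minimal Python sketch follows; the function name is hypothetical, and both the clamping to 0-1 and the zero result when min equals max are assumptions rather than confirmed behaviour of the data filter function.

```python
def unity_normalize(lo, hi, value):
    """Rescale value from the range lo..hi onto 0..1."""
    if hi == lo:
        # assumption: a degenerate range yields no heat at all
        return 0.0
    scaled = (value - lo) / (hi - lo)
    # assumption: values outside lo..hi are clamped
    return min(1.0, max(0.0, scaled))


print(unity_normalize(0, 400, 200))
```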
.. added Utils::heatcolor(x) method to convert a heat value
from 0-1 to a hue/saturation value color
.. the overview program now has another user definable function
called h {} which returns the heat values. If it is not
present no heat coloring takes place.
.. added h {} to the legacy intervals program; it calls
heat() with 0 for both min and max, which ultimately
makes it do nothing. Crucially, the user can adapt the
min and max values to meet their requirements.
.. mostly to make using the activities() function a lot
simpler. A parameter can include a block of code that
should be evaluated as a parameter.
e.g:
activities("isRun", { xx <- metrics(date);
yy <- metrics(Pace); } );
this avoids having to declare a function and call it
just so we can pass as a function parameter.
.. it was always rather dodgy, but caused issues when charts
were recreated on config change (it interacts badly with
the setUpdatesEnabled() call).
.. has a nice effect of stopping the jarring repaints too
which were horrible when themes changed.
Fixes #4029
.. aggmetricstrings() and aggmetrics()
data filter functions that return aggregated values as
opposed to all values for the activities.
.. asaggstrings()
data filter function that returns aggregated values for
the list of metrics provided (primarily used in data
tables).
.. the next commit includes an update to the data table
settings tool to use asaggstrings on trends view.
.. lots of problems related to this, notably:
* UserChart is no longer a GcWindow so doesn't have any
properties registered.
* Even if it was, the property was not being registered
by GcWindow or GcChartWindow anyway
* The value was not being initialised so checking for
NULL was kinda pointless (groan)
* OverviewItems looked up the property and never found
it, so crashes were avoided by accident.
.. One interesting point that was revealed during testing
and debugging-- the UserChart program does not honor
any filtering EXCEPT for the activity{ } function, which
although it is not by design, is quite useful.
Fixes #4021
.. returns the powerindex for the given power and duration
which can be vectors.
.. useful to transform meanmax power to a strengths and
weaknesses rating.
.. when moving the scaling slider the charts get updated
immediately, this causes a SEGV as charts are deleted
whilst they are being updated.
.. we now block updates whilst critical processing is
happening to avoid this.
Fixes #4026