Thanks for the update @robertl (I had seen the custom code caching bug but didn’t recognise it as such; I just assumed I had messed something up!)
Can you clarify the “can’t concat numeric with categorical” point? If I understood correctly, the PL data wizard/model creator system automatically one-hot encodes categories, and one-hot is usually numeric 1 or 0, so I don’t see a numeric vs categorical issue per se.
Are you saying that the result of one-hot encoding a categorical variable is a vector of 1s and 0s, i.e. “rank-1”, whereas scalars are 0-dimensional, i.e. “rank-0”?
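To make the rank question concrete, here is a minimal NumPy sketch. NumPy is only a stand-in here (I’m assuming PL’s tensors behave similarly, which I haven’t confirmed), and the values are made up:

```python
import numpy as np

# A single numeric feature: rank-0, shape ()
scalar = np.float32(3.5)
# A one-hot encoded category with 3 classes: rank-1, shape (3,)
one_hot = np.array([0.0, 1.0, 0.0], dtype=np.float32)

print(np.ndim(scalar), np.ndim(one_hot))  # 0 1

# Concatenating a rank-0 value with a rank-1 vector is rejected,
# even though every value involved is numeric.
try:
    np.concatenate([scalar, one_hot])
    concat_error = None
except ValueError as err:
    concat_error = str(err)

print(concat_error)
```

If this is what is happening, the problem is the mismatch in rank rather than numeric vs categorical as such.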
I’ve come across related issues before.
I think it’s a conceptual weakness in all such systems: see e.g. Wikipedia on scalar, where it says:
The term scalar is also sometimes used informally to mean a vector, matrix, tensor, or other, usually, “compound” value that is actually reduced to a single component. Thus, for example, the product of a 1 × n matrix and an n × 1 matrix, which is formally a 1 × 1 matrix, is often said to be a scalar.
which is clearly an unhelpful aspect of the way the terms are used.
After all, there is one element in a scalar, so if (x1, x2, …, xn) is an n-D vector, there is a good sense in which (x1) on its own is a 1-D vector!
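The same ambiguity shows up in array libraries. A small NumPy sketch with made-up values, showing that the “1 × 1 matrix”, the scalar and the one-element vector are three different shapes holding the same number:

```python
import numpy as np

row = np.array([[1.0, 2.0, 3.0]])        # 1 x 3 matrix
col = np.array([[4.0], [5.0], [6.0]])    # 3 x 1 matrix

product = row @ col                      # formally a 1 x 1 matrix, shape (1, 1)
as_scalar = product.item()               # the single component as a plain float
one_element_vector = np.array([as_scalar])  # "(x1)" as a 1-D vector, shape (1,)

print(product.shape, np.ndim(as_scalar), one_element_vector.shape)
```

All three hold 32.0, but only the last one is rank-1.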
Workaround (see update!) Would I be correct in thinking that the simplest workaround (until e.g. there’s an expand_dims option, or one does it in code) would be to concat all the scalars with the Merge node to create a vector, which can then be concatenated with the one-hot encoded category vector, so that both are rank-1? (Presumably this needs more than one scalar in order to make a vector and use the Merge; use the same one twice if there is only one?) Scalar concat works because the inputs are all rank-0 and the output is rank-1, which is the same as concatenating vectors end-to-end. Can you confirm that the PL default is to stack “end-to-end”, which is the only sensible option if the inputs could be of different lengths?
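Here is what I have in mind, sketched in NumPy rather than in PL itself. The feature names `age` and `income` are hypothetical, and `np.stack` / `np.concatenate` are only my assumption of what the two Merge stages would be doing:

```python
import numpy as np

# Hypothetical rank-0 numeric inputs
age = np.float32(42.0)
income = np.float32(55.0)
# One-hot encoded category: rank-1, shape (3,)
one_hot = np.array([0.0, 1.0, 0.0], dtype=np.float32)

# Stage 1: combine the scalars into one rank-1 vector, shape (2,)
numeric = np.stack([age, income])

# Stage 2: both inputs are now rank-1, so end-to-end concatenation works
features = np.concatenate([numeric, one_hot])  # shape (5,)

# With only one scalar, expand_dims gives a length-1 vector,
# so there is no need to use the same input twice
single = np.expand_dims(age, axis=0)           # shape (1,)
features_single = np.concatenate([single, one_hot])  # shape (4,)

print(features.shape, features_single.shape)
```

Whether the Merge node accepts rank-0 inputs in this way is exactly what I’m asking about.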
Side question re performance: if there are many numeric inputs and they are all treated as scalars, does this affect how efficiently GPU resources can be used, compared with automatically creating a single vector of all numeric inputs?
Update Though there’s a preview issue on everything else, all error reports are gone: the 2-stage concatenation seems to work in the model view… but there are other rank issues when I try to run it. I will rebuild from scratch with this new approach and see how it goes. See update 2 for at least one reason this is not in fact OK!
Update 2 Alas! Woe!
Rebuilt - and viewing in Incognito mode to avoid cache issues - same structure as above, but Merge_2_1 is unhappy. However, I now notice that Merge_2 in the 1st image is Addition (please add an indicator to the Merge component to show the Operation type - I’ll make that a feature request) - how does that work when the vectors are different lengths??
And when I make Merge_2 concatenate, it is the same as the new Merge_2_1.
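On the Addition question, this is what I would expect from element-wise addition, again sketched in NumPy with made-up values (I don’t know how PL’s Merge handles it internally):

```python
import numpy as np

numeric = np.array([42.0, 55.0])     # shape (2,)
one_hot = np.array([0.0, 1.0, 0.0])  # shape (3,)

# Element-wise addition needs compatible shapes; (2,) and (3,) are not
try:
    numeric + one_hot
    add_error = None
except ValueError as err:
    add_error = str(err)

# Concatenation has no such requirement for rank-1 inputs
joined = np.concatenate([numeric, one_hot])  # shape (5,)

print(add_error)
print(joined.shape)
```

Note that a length-1 vector would broadcast silently in NumPy instead of raising an error, which would make an accidental Addition even harder to spot.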
So… still experiencing merge challenges