I wouldn't want to give ANY weight to the lowest price when considering the higher prices - the point here is that the lowest price is fairly often not representative of what you'd pay for more than one.
Actually, if you take a straight arithmetic mean of the first X items, then it is *exactly* representative of the cost you would pay for each one, if you buy exactly X.
Of course, that's only relevant if you are considering buying the item in a quantity greater than one. The cached mall price mechanism was designed with the purchase of your daily consumables in mind, not with valuing your display case.
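A trivial numeric sketch of that point, with made-up per-item prices (cheapest first, one lowball listing followed by four normally priced ones):

# Hypothetical per-item prices, cheapest first.
first_five = [100, 5_000, 5_000, 5_500, 5_500]
print(sum(first_five))       # 21100 meat to buy all five
print(sum(first_five) / 5)   # 4220.0 meat each, i.e. the arithmetic mean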
Valuing items is hard anyway, even with knowledge of how many were actually sold and for how much (koldfront market).
How does this sound for valuing rarer/more expensive items:
If the 1st item's price is at least 10M, use that - it's almost certainly not a mispricing, and you're unlikely to be buying such items in bulk!
Otherwise, if the 2nd item is at least 1M, use that.
Otherwise, if the 3rd item is at least 100K, use that; failing that, if the 4th item is at least 10K, use that.
Otherwise, use the 5th item's price as before.
Does anyone see any downside (or potential for abuse) in that?
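Here is a minimal Python sketch of that rule, assuming prices is simply the list of per-item mall prices from cheapest up (the function name and that representation are mine for illustration, not anything mafia actually exposes):

def value_first_match(prices):
    """Tiered valuation sketch following the proposal above."""
    if len(prices) >= 1 and prices[0] >= 10_000_000:
        return prices[0]   # 1st item at 10M+: almost certainly not a mispricing
    if len(prices) >= 2 and prices[1] >= 1_000_000:
        return prices[1]   # 2nd item at 1M+
    if len(prices) >= 3 and prices[2] >= 100_000:
        return prices[2]   # 3rd item at 100K+
    if len(prices) >= 4 and prices[3] >= 10_000:
        return prices[3]   # 4th item at 10K+
    return prices[min(4, len(prices) - 1)]   # otherwise the 5th item, as before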
I don't really like the hardcoded price limits. They make things like this possible:
1@100 2@100k 2+@1M -> 100k
but
1@100 2@99k 2+@1M -> 1M
This is a pathological corner case, but if one such case exists, I fear that something like that will happen eventually to someone out there.
(arithmetic mean of first five would peg both at ~440k)
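For reference, a standalone check of those two cases (the per-item expansion of the shorthand is mine):

case_a = [100, 100_000, 100_000, 1_000_000, 1_000_000]   # 1@100 2@100k 2+@1M
case_b = [100,  99_000,  99_000, 1_000_000, 1_000_000]   # 1@100 2@99k  2+@1M
# Under the tiered rule, case_a's 3rd item (100k) clears the 100K check,
# while case_b's 3rd item (99k) does not and falls through to its 4th item (1M).
print(sum(case_a) / 5, sum(case_b) / 5)   # 440020.0 439620.0, both ~440k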
Or how about:
If any of the first five items are at least 10M, use the lowest at-least-10M-price of them.
Otherwise, if any of the first five items are at least 1M, use the lowest at-least-1M-price of them.
Otherwise ... 100k ...
Otherwise ... 10k ...
Otherwise use the 5th item's price as before.
(which puts both of the above at 1M)
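Sketched the same way as before (again, the plain per-item price list is just an assumption for illustration):

def value_lowest_above_threshold(prices):
    """Among the first five per-item prices, find the highest threshold that
    any of them clears, then return the lowest price clearing it."""
    first_five = prices[:5]
    for threshold in (10_000_000, 1_000_000, 100_000, 10_000):
        above = [p for p in first_five if p >= threshold]
        if above:
            return min(above)
    return first_five[-1]   # nothing clears even 10K: 5th item, as before

Both corner cases above now land on the 1M tier, so both return 1,000,000.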
But if I really wanted to do value analysis from mall search results, I would go over all the results, remove the first few and the last few items (say, 4 from each end), then look at the remaining amounts and pick the row with the highest amount_available*price/row_number, or something like that.
So
1@100
50@1,000
1000@3,000
500@3,100
1@90,000,000
becomes (after removing 4 from each end)
0@100 (removed 1)
47@1,000 (removed the remaining 3)
1000@3,000
497@3,100 (removed 3 - from back)
0@90,000,000 (removed 1 - from back)
for ratings of
0*100/1=0
47*1000/2=23500
1000*3000/3=1000000
497*3100/4=385175
0*90000000/5=0
and pick 3,000 (from row 3, which has the highest rating at 1M) as the value.
But if there were slightly more than 2,000 available at 1,000 meat (instead of just 47), it would pick that row instead.
(this is loosely based on what I do when setting prices by hand)
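A rough Python sketch of that procedure, assuming the search results come in as (amount_available, price) pairs in mall order (the function name and that shape are made up for illustration):

def trimmed_value(rows, trim=4):
    """Drop a few items from the cheap end and a few from the expensive end,
    rate each remaining row as amount * price / row_number, and return the
    price of the best-rated row."""
    amounts = [amount for amount, _ in rows]

    # Remove trim individual items from the front (cheapest listings) ...
    remaining = trim
    for i in range(len(amounts)):
        removed = min(amounts[i], remaining)
        amounts[i] -= removed
        remaining -= removed
        if remaining == 0:
            break

    # ... and trim items from the back (most expensive listings).
    remaining = trim
    for i in reversed(range(len(amounts))):
        removed = min(amounts[i], remaining)
        amounts[i] -= removed
        remaining -= removed
        if remaining == 0:
            break

    best_price, best_rating = None, -1.0
    for row, ((_, price), amount) in enumerate(zip(rows, amounts), start=1):
        rating = amount * price / row
        if rating > best_rating:
            best_price, best_rating = price, rating
    return best_price

listings = [(1, 100), (50, 1_000), (1000, 3_000), (500, 3_100), (1, 90_000_000)]
print(trimmed_value(listings))                                  # 3000
print(trimmed_value([(1, 100), (2100, 1_000)] + listings[2:]))  # 1000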
For anything significantly better, I would need to know how fast the item in question moves (number of items sold per day) and the maximum traded price, so I could remove *all* bogus high prices.