I'm a native US speaker and I think Wikipedia is correct. And I think you are correct about "very" being an example.
There is a lot of blurring and slurring in rapid casual speech.
However, I don't think the "R" disappears completely. Or rather, you still hear the "eh" sound of "e" and the "ee" sound of "y," so you hear a kind of diphthong, "eh-ee," which really sounds a lot like a faint "R." In other words, it's not a fully pronounced "very," but it isn't "vay" either. It's "veh-yee."
Native speakers aren't aware of these things. We _think_ we are saying the "R" in "very," and we _think_ we are HEARING the "R" when we hear it. We don't consciously apply any rule.
Connected speech, the flowing sound of normal talk, is very different from the sounds of unconnected phonemes.
One piece of evidence for this can be found in Internet transcriptions of song lyrics! Many of these seem to have been made by people listening to the song. The same song may be transcribed with different words, because even NATIVE speakers aren't always sure exactly which words they are hearing.