Heightmap terrains in jME consist of meshes. The 1990s “voxel terrains” (e.g. Comanche) did not use meshes, which is why they ran so fast on the computers of that era.
Heightmap-based terrain means that you can only have one height value per geographical position. That rules out a cave under a hill, which would need three height values at the same position: one for the top of the hill, one for the cave’s ceiling and one for the cave’s floor.
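To make the "one height per position" limit concrete, here is a minimal sketch in plain Java. It is not jME's terrain API: the class `HeightmapSketch` and its methods are hypothetical names made up for illustration, and the heightmap is assumed to be a simple 2D float array.

```java
// Hypothetical illustration only, not part of jME.
final class HeightmapSketch {
    // One float per (x, z) grid position: the terrain height at that spot.
    private final float[][] heights;

    HeightmapSketch(int sizeX, int sizeZ) {
        heights = new float[sizeX][sizeZ];
    }

    void setHeight(int x, int z, float y) {
        heights[x][z] = y;
    }

    float getHeight(int x, int z) {
        return heights[x][z];
    }

    public static void main(String[] args) {
        HeightmapSketch map = new HeightmapSketch(4, 4);
        map.setHeight(1, 2, 10f); // top of the hill
        map.setHeight(1, 2, 3f);  // "cave floor" at the same spot replaces the hill
        System.out.println(map.getHeight(1, 2));
    }
}
```

There is simply no slot to store a second or third height at (1, 2), so the second `setHeight` call overwrites the hill instead of adding a cave beneath it.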
Both the current technology (with meshes) and the 1990s technology (no meshes) use a heightmap. There are also terrains that are not heightmap-based, e.g. full voxel terrains like the one in GSoC 2014 - Voxel Terrain System - #7 by FuzzyMonkey.
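For contrast, here is a rough sketch of why a full voxel grid can hold a cave under a hill. Again this is plain Java with hypothetical names (`VoxelColumnSketch`, `countBoundaries`) and an assumed `solid[x][z][y]` layout; it is not taken from the GSoC project or from jME.

```java
// Hypothetical illustration only, not part of jME or the GSoC voxel project.
final class VoxelColumnSketch {
    // Counts how often a vertical column switches between solid and empty.
    // Assumed layout: solid[x][z][y], with y pointing up.
    static int countBoundaries(boolean[][][] solid, int x, int z) {
        boolean[] column = solid[x][z];
        int boundaries = 0;
        for (int y = 1; y < column.length; y++) {
            if (column[y] != column[y - 1]) {
                boundaries++;
            }
        }
        return boundaries;
    }

    public static void main(String[] args) {
        boolean[][][] solid = new boolean[1][1][10];
        // y = 0..2: ground below the cave, y = 3..4: cave (empty),
        // y = 5..7: hill above the cave, y = 8..9: open air.
        for (int y = 0; y <= 2; y++) solid[0][0][y] = true;
        for (int y = 5; y <= 7; y++) solid[0][0][y] = true;
        System.out.println(countBoundaries(solid, 0, 0));
    }
}
```

The column has three solid/empty boundaries (cave floor, cave ceiling, hilltop), which is exactly what a single height value per position cannot express.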