Accurate depth estimation from light field images is essential for many applications. Deep learning-based techniques have shown great potential for this problem but still face challenges such as sensitivity to occlusions and difficulty in handling untextured areas. To overcome these limitations, we propose a novel approach that exploits both local and global features in the cost volume for depth estimation. Specifically, our hybrid cost volume network consists of two complementary sub-modules: a 2D ContextNet that captures global context information and a matching cost volume that encodes local feature information. We also introduce an occlusion-aware loss that accounts for occluded areas to improve depth estimation quality. Experiments on the UrbanLF and HCInew datasets demonstrate significant improvements over existing methods, especially in occluded and untextured regions. By explicitly disentangling local feature and global semantic information, our method reduces reconstruction error in occluded and untextured areas and improves the accuracy of depth estimation.
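The abstract does not give the exact form of the occlusion-aware loss. As a minimal illustrative sketch (the function name, the weighting scheme, and the weight value are all assumptions, not the paper's definition), one common way to account for occluded areas is to up-weight a per-pixel L1 depth loss inside an occlusion mask:

```python
import numpy as np

def occlusion_aware_l1(pred, gt, occ_mask, occ_weight=2.0):
    """Weighted per-pixel L1 disparity loss (illustrative sketch only).

    pred, gt   : (H, W) float arrays of predicted / ground-truth disparity
    occ_mask   : (H, W) bool array, True where a pixel is occluded
    occ_weight : extra weight for occluded pixels (assumed value)
    """
    weights = np.where(occ_mask, occ_weight, 1.0)
    # Normalize by the total weight so the loss stays comparable
    # across images with different amounts of occlusion.
    return float(np.sum(weights * np.abs(pred - gt)) / np.sum(weights))

# Toy usage: the only error (0.5) falls on an occluded pixel,
# so it is weighted by occ_weight before normalization.
pred = np.array([[0.0, 1.0], [2.0, 3.0]])
gt   = np.array([[0.0, 1.5], [2.0, 3.0]])
mask = np.array([[False, True], [False, False]])
loss = occlusion_aware_l1(pred, gt, mask)
```

In this toy example the weighted error sum is 2.0 × 0.5 = 1.0 and the weight sum is 5.0, so the loss is 0.2; with uniform weights it would be 0.125, showing how the mask emphasizes occluded regions.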