With the growing deployment of Unmanned Aerial Vehicle (UAV) networks, traffic offloading has been widely used to mitigate the limited communication bandwidth of UAVs, which stems from their limited battery capacity. Existing work on traffic offloading has focused on reducing the average delay of the system without considering fairness, and has assumed that data transmission follows line-of-sight propagation, which contradicts realistic conditions in both urban and suburban areas. Achieving both fairness and system efficiency with non-line-of-sight user-UAV communication requires solving a complex non-convex optimization problem. This paper proposes an effective algorithm (NAPPO) for joint UAV navigation and user traffic allocation by applying deep reinforcement learning (DRL). NAPPO applies DRL to collect user information (position, data rate, and traffic demand) and dynamically adjusts the UAV position and traffic allocation ratio to minimize the maximum delay and hence improve fairness (i.e., reduce the variation in delay between users). We show that our proposed approach of minimizing the maximum delay is more effective than minimizing the average delay for achieving fairness while keeping the total delay at a reasonable level. Simulation results show that NAPPO achieves a maximum delay that is 49.82% lower than that of the heuristic algorithm and only 0.1536 s higher than that of the optimal solution.
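The contrast between the min-max and min-average objectives can be illustrated with a toy sketch. It is not NAPPO and does not use the paper's delay model: it assumes a hypothetical setting in which a fixed bandwidth is split among users and each user's delay is simply demand divided by allocated bandwidth. Under that assumption both objectives have closed-form solutions (shares proportional to demand for min-max, shares proportional to the square root of demand for min-average). All function names and numbers are illustrative.

```python
import math

def allocate_min_avg(demands, total_bw):
    # Minimizing sum(d_i / b_i) subject to sum(b_i) = total_bw
    # gives b_i proportional to sqrt(d_i) (Lagrangian condition).
    roots = [math.sqrt(d) for d in demands]
    scale = total_bw / sum(roots)
    return [r * scale for r in roots]

def allocate_min_max(demands, total_bw):
    # Minimizing max(d_i / b_i) equalizes all delays,
    # so b_i is proportional to d_i.
    scale = total_bw / sum(demands)
    return [d * scale for d in demands]

def delays(demands, shares):
    # Toy delay model (an assumption): delay = demand / bandwidth share.
    return [d / b for d, b in zip(demands, shares)]

def summarize(ds):
    return {
        "max": max(ds),
        "mean": sum(ds) / len(ds),
        "spread": max(ds) - min(ds),  # simple fairness indicator
    }

# Hypothetical demands and bandwidth, chosen only for illustration.
demands = [1.0, 4.0, 9.0]
total_bw = 14.0

avg_stats = summarize(delays(demands, allocate_min_avg(demands, total_bw)))
max_stats = summarize(delays(demands, allocate_min_max(demands, total_bw)))

print("min-average objective:", avg_stats)
print("min-max objective:    ", max_stats)
```

In this toy setting the min-max allocation yields a lower worst-case delay and no spread between users, at the cost of a somewhat higher mean delay, which mirrors the fairness versus total-delay trade-off described above.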