This is *almost* a duplicate (see my comment above), but I'll give a quick answer before it (likely) gets closed.

The first inequality in the OP has a superfluous $\log(n/d)$ factor, which can be removed via chaining, as discussed here: https://cstheory.stackexchange.com/questions/40217/tight-vc-bound-for-agnostic-learning

Once that log factor is removed, the VC bound becomes optimal up to constants (i.e., essentially unimprovable). See Section 5 of the Anthony-Bartlett book: http://www.cambridge.org/gb/academic/subjects/computer-science/pattern-recognition-and-machine-learning/neural-network-learning-theoretical-foundations?format=HB&isbn=9780521573535#JEHXi1lP4I6Gl8dg.97

Regarding the optimality of the Rademacher bound, see the discussion of Sudakov's inequality (and the links) here: https://cstheory.stackexchange.com/questions/40312/rademacher-complexity-beyond-the-agnostic-setting
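To make the first point concrete, here is the shape of the improvement (a sketch, assuming the OP's $n$ is the sample size, $d$ is the VC dimension of the class $\mathcal{H}$, and $C$ denotes an absolute constant). The standard uniform-deviation bound reads
$$\mathbb{E}\sup_{h\in\mathcal{H}}\bigl|\mathrm{err}(h)-\widehat{\mathrm{err}}_n(h)\bigr|\;\le\;C\sqrt{\frac{d\log(n/d)}{n}},$$
and chaining (Dudley's entropy integral combined with Haussler's covering-number bound for VC classes) removes the logarithm:
$$\mathbb{E}\sup_{h\in\mathcal{H}}\bigl|\mathrm{err}(h)-\widehat{\mathrm{err}}_n(h)\bigr|\;\le\;C\sqrt{\frac{d}{n}}.$$
The latter matches the known $\Omega(\sqrt{d/n})$ lower bound, which is the sense in which it is unimprovable.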
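On the Rademacher side, the lower-bound tool in that linked thread is Sudakov's minoration. Stated for a Gaussian process $(X_t)_{t\in T}$ (a sketch; the Rademacher analogue holds up to logarithmic factors):
$$\mathbb{E}\sup_{t\in T}X_t\;\ge\;c\,\varepsilon\sqrt{\log N(T,\varepsilon)}\qquad\text{for every }\varepsilon>0,$$
where $N(T,\varepsilon)$ is the $\varepsilon$-covering number of $T$ in the canonical metric $d(s,t)=\sqrt{\mathbb{E}(X_s-X_t)^2}$ and $c>0$ is an absolute constant. In other words, the Gaussian (hence, essentially, Rademacher) complexity cannot be much smaller than what the covering numbers dictate.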