A deep co-attentive hand-based video question answering framework using multi-view skeleton