
The identifier completer now reads tags files

See the docs for details. Fixes #135.
Strahinja Val Markovic 12 years ago
commit 454a961318

+ 56 - 2
README.md

@@ -40,8 +40,8 @@ the menu (so you usually need to press TAB just once).
 
 **All of the above works with any programming language** because of the
 identifier-based completion engine. It collects all of the identifiers in the
-current file and other files you visit and searches them when you type
-(identifiers are put into per-filetype groups).
+current file and other files you visit (and your tags files) and searches them
+when you type (identifiers are put into per-filetype groups).
 
 The demo also shows the semantic engine in use. When the user presses `.`, `->`
 or `::` while typing in insert mode (for C++; different triggers are used for
@@ -289,6 +289,26 @@ User Guide
   Shift-TAB binding will not work because the console will not pass it to Vim.
   You can remap the keys; see the _Options_ section below.
 
+Knowing a little bit about how YCM works internally will prevent confusion. YCM
+has several completion engines. The first is an identifier-based completer that
+collects all of the identifiers in the current file and other files you visit
+(and your tags files) and searches them when you type (identifiers are put into
+per-filetype groups).
+
+There are also several semantic engines in YCM. There's a libclang-based
+completer that provides semantic completion for C-family languages. There's a
+Jedi-based completer for semantic completion for Python. There's also an
+omnifunc-based completer that uses data from Vim's omnicomplete system to
+provide semantic completions when no native completer exists for that language
+in YCM.
+
+There are also other completion engines, like the UltiSnips completer and the
+filepath completer.
+
+YCM automatically detects which completion engine would be the best in any
+situation. On occasion, it queries several of them at once, merges the
+outputs and presents the results to you.
+
 ### Completion string ranking
 
 The subsequence filter removes any completions that do not match the input, but
@@ -712,6 +732,23 @@ Default: `0`
 
     let g:ycm_collect_identifiers_from_comments_and_strings = 0
 
+### The `g:ycm_collect_identifiers_from_tags_files` option
+
+When this option is set to `1`, YCM's identifier completer will also collect
+identifiers from tags files. The list of tags files to examine is retrieved from
+the `tagfiles()` Vim function which examines the `tags` Vim option. See `:h
+'tags'` for details.
+
+YCM will re-index your tags files if it detects that they have been modified.
+
+The only supported tag format is the [Exuberant Ctags format][ctags-format]. The
+format from "plain" ctags is NOT supported. See the _FAQ_ for pointers if YCM
+does not appear to read your tag files.
+
+Default: `1`
+
+    let g:ycm_collect_identifiers_from_tags_files = 1
+
 ### The `g:ycm_add_preview_to_completeopt` option
 
 When this option is set to `1`, YCM will add the `preview` string to Vim's
@@ -1091,6 +1128,21 @@ CompileCommands API) were added after their cut.
 So just go through the installation guide and make sure you are using a correct
 `libclang.so`. I recommend downloading prebuilt binaries from llvm.org.
 
+### YCM does not read identifiers from my tags files
+
+Make sure you are using [Exuberant Ctags][exuberant-ctags] to produce your tags
+files since the only supported tag format is the [Exuberant Ctags
+format][ctags-format]. The format from "plain" ctags is NOT supported. The
+output of `ctags --version` should list "Exuberant Ctags".
+
+NOTE: Mac OS X comes with "plain" ctags installed by default. `brew install
+ctags` will get you the Exuberant Ctags version.
+
+Also make sure that your Vim `tags` option is set correctly. See `:h 'tags'` for
+details. If you want to see which tag files YCM will read for a given buffer,
+run `:echo tagfiles()` with the relevant buffer active. Note that this function
+will only list tag files that already exist.
+
 ### `CTRL-U` in insert mode does not work
 
 YCM keeps you in a `completefunc` completion mode when you're typing in insert
@@ -1174,3 +1226,5 @@ This software is licensed under the [GPL v3 license][gpl].
 [eclim]: http://eclim.org/
 [jedi]: https://github.com/davidhalter/jedi
 [ultisnips]: https://github.com/SirVer/ultisnips/blob/master/doc/UltiSnips.txt
+[exuberant-ctags]: http://ctags.sourceforge.net/
+[ctags-format]: http://ctags.sourceforge.net/FORMAT

+ 33 - 8
cpp/ycm/IdentifierCompleter.cpp

@@ -79,7 +79,7 @@ IdentifierCompleter::IdentifierCompleter()
 IdentifierCompleter::IdentifierCompleter(
   const std::vector< std::string > &candidates )
   : threading_enabled_( false ) {
-  identifier_database_.AddCandidates( candidates, "", "" );
+  identifier_database_.AddIdentifiers( candidates, "", "" );
 }
 
 
@@ -88,7 +88,7 @@ IdentifierCompleter::IdentifierCompleter(
   const std::string &filetype,
   const std::string &filepath )
   : threading_enabled_( false ) {
-  identifier_database_.AddCandidates( candidates, filetype, filepath );
+  identifier_database_.AddIdentifiers( candidates, filetype, filepath );
 }
 
 
@@ -111,17 +111,42 @@ void IdentifierCompleter::EnableThreading() {
 }
 
 
-void IdentifierCompleter::AddCandidatesToDatabase(
+void IdentifierCompleter::AddIdentifiersToDatabase(
     const std::vector< std::string > &new_candidates,
     const std::string &filetype,
     const std::string &filepath ) {
-  identifier_database_.AddCandidates( new_candidates,
+  identifier_database_.AddIdentifiers( new_candidates,
                                       filetype,
                                       filepath );
 }
 
 
-void IdentifierCompleter::AddCandidatesToDatabaseFromBuffer(
+void IdentifierCompleter::AddIdentifiersToDatabaseFromTagFiles(
+    const std::vector< std::string > &absolute_paths_to_tag_files ) {
+  foreach( const std::string &path, absolute_paths_to_tag_files ) {
+    identifier_database_.AddIdentifiers(
+        ExtractIdentifiersFromTagsFile( path ) );
+  }
+}
+
+
+void IdentifierCompleter::AddIdentifiersToDatabaseFromTagFilesAsync(
+    std::vector< std::string > absolute_paths_to_tag_files ) {
+  // TODO: throw exception when threading is not enabled and this is called
+  if ( !threading_enabled_ )
+    return;
+
+  boost::function< void() > functor =
+    boost::bind( &IdentifierCompleter::AddIdentifiersToDatabaseFromTagFiles,
+                 boost::ref( *this ),
+                 boost::move( absolute_paths_to_tag_files ) );
+
+  buffer_identifiers_task_stack_.Push(
+    boost::make_shared< packaged_task< void > >( boost::move( functor ) ) );
+}
+
+
+void IdentifierCompleter::AddIdentifiersToDatabaseFromBuffer(
   const std::string &buffer_contents,
   const std::string &filetype,
   const std::string &filepath,
@@ -133,14 +158,14 @@ void IdentifierCompleter::AddCandidatesToDatabaseFromBuffer(
     buffer_contents :
     RemoveIdentifierFreeText( buffer_contents );
 
-  identifier_database_.AddCandidates(
+  identifier_database_.AddIdentifiers(
       ExtractIdentifiersFromText( new_contents ),
       filetype,
       filepath );
 }
 
 
-void IdentifierCompleter::AddCandidatesToDatabaseFromBufferAsync(
+void IdentifierCompleter::AddIdentifiersToDatabaseFromBufferAsync(
   std::string buffer_contents,
   std::string filetype,
   std::string filepath,
@@ -150,7 +175,7 @@ void IdentifierCompleter::AddCandidatesToDatabaseFromBufferAsync(
     return;
 
   boost::function< void() > functor =
-    boost::bind( &IdentifierCompleter::AddCandidatesToDatabaseFromBuffer,
+    boost::bind( &IdentifierCompleter::AddIdentifiersToDatabaseFromBuffer,
                  boost::ref( *this ),
                  boost::move( buffer_contents ),
                  boost::move( filetype ),

+ 10 - 3
cpp/ycm/IdentifierCompleter.h

@@ -51,12 +51,19 @@ public:
 
   void EnableThreading();
 
-  void AddCandidatesToDatabase(
+  void AddIdentifiersToDatabase(
     const std::vector< std::string > &new_candidates,
     const std::string &filetype,
     const std::string &filepath );
 
-  void AddCandidatesToDatabaseFromBuffer(
+  void AddIdentifiersToDatabaseFromTagFiles(
+    const std::vector< std::string > &absolute_paths_to_tag_files );
+
+  // NOTE: params are taken by value on purpose!
+  void AddIdentifiersToDatabaseFromTagFilesAsync(
+    std::vector< std::string > absolute_paths_to_tag_files );
+
+  void AddIdentifiersToDatabaseFromBuffer(
     const std::string &buffer_contents,
     const std::string &filetype,
     const std::string &filepath,
@@ -65,7 +72,7 @@ public:
   // NOTE: params are taken by value on purpose! With a C++11 compiler we can
   // avoid an expensive copy of buffer_contents if the param is taken by value
   // (move ctors FTW)
-  void AddCandidatesToDatabaseFromBufferAsync(
+  void AddIdentifiersToDatabaseFromBufferAsync(
     std::string buffer_contents,
     std::string filetype,
     std::string filepath,

+ 48 - 24
cpp/ycm/IdentifierDatabase.cpp

@@ -40,28 +40,36 @@ IdentifierDatabase::IdentifierDatabase()
 }
 
 
-void IdentifierDatabase::AddCandidates(
+void IdentifierDatabase::AddIdentifiers(
+    const FiletypeIdentifierMap &filetype_identifier_map ) {
+  boost::lock_guard< boost::mutex > locker( filetype_candidate_map_mutex_ );
+
+  foreach ( const FiletypeIdentifierMap::value_type & filetype_and_map,
+            filetype_identifier_map ) {
+    foreach( const FilepathToIdentifiers::value_type & filepath_and_identifiers,
+             filetype_and_map.second ) {
+      AddIdentifiersNoLock( filepath_and_identifiers.second,
+                            filetype_and_map.first,
+                            filepath_and_identifiers.first );
+    }
+  }
+}
+
+
+void IdentifierDatabase::AddIdentifiers(
   const std::vector< std::string > &new_candidates,
   const std::string &filetype,
   const std::string &filepath ) {
-  boost::lock_guard< boost::mutex > locker( filetype_map_mutex_ );
-  std::list< const Candidate *> &candidates =
-    GetCandidateList( filetype, filepath );
-
-  std::vector< const Candidate * > repository_candidates =
-    candidate_repository_.GetCandidatesForStrings( new_candidates );
-
-  candidates.insert( candidates.end(),
-                     repository_candidates.begin(),
-                     repository_candidates.end() );
+  boost::lock_guard< boost::mutex > locker( filetype_candidate_map_mutex_ );
+  AddIdentifiersNoLock( new_candidates, filetype, filepath );
 }
 
 
 void IdentifierDatabase::ClearCandidatesStoredForFile(
   const std::string &filetype,
   const std::string &filepath ) {
-  boost::lock_guard< boost::mutex > locker( filetype_map_mutex_ );
-  GetCandidateList( filetype, filepath ).clear();
+  boost::lock_guard< boost::mutex > locker( filetype_candidate_map_mutex_ );
+  GetCandidateSet( filetype, filepath ).clear();
 }
 
 
@@ -69,12 +77,12 @@ void IdentifierDatabase::ResultsForQueryAndType(
   const std::string &query,
   const std::string &filetype,
   std::vector< Result > &results ) const {
-  FiletypeMap::const_iterator it;
+  FiletypeCandidateMap::const_iterator it;
   {
-    boost::lock_guard< boost::mutex > locker( filetype_map_mutex_ );
-    it = filetype_map_.find( filetype );
+    boost::lock_guard< boost::mutex > locker( filetype_candidate_map_mutex_ );
+    it = filetype_candidate_map_.find( filetype );
 
-    if ( it == filetype_map_.end() || query.empty() )
+    if ( it == filetype_candidate_map_.end() || query.empty() )
       return;
   }
   Bitset query_bitset = LetterBitsetFromString( query );
@@ -84,7 +92,7 @@ void IdentifierDatabase::ResultsForQueryAndType(
   seen_candidates.reserve( candidate_repository_.NumStoredCandidates() );
 
   {
-    boost::lock_guard< boost::mutex > locker( filetype_map_mutex_ );
+    boost::lock_guard< boost::mutex > locker( filetype_candidate_map_mutex_ );
     foreach ( const FilepathToCandidates::value_type & path_and_candidates,
               *it->second ) {
       foreach ( const Candidate * candidate, *path_and_candidates.second ) {
@@ -109,26 +117,42 @@ void IdentifierDatabase::ResultsForQueryAndType(
 }
 
 
-// WARNING: You need to hold the filetype_map_mutex_ before calling this
-// function and while using the returned list.
-std::list< const Candidate * > &IdentifierDatabase::GetCandidateList(
+// WARNING: You need to hold the filetype_candidate_map_mutex_ before calling
+// this function and while using the returned set.
+std::set< const Candidate * > &IdentifierDatabase::GetCandidateSet(
   const std::string &filetype,
   const std::string &filepath ) {
   boost::shared_ptr< FilepathToCandidates > &path_to_candidates =
-    filetype_map_[ filetype ];
+    filetype_candidate_map_[ filetype ];
 
   if ( !path_to_candidates )
     path_to_candidates.reset( new FilepathToCandidates() );
 
-  boost::shared_ptr< std::list< const Candidate * > > &candidates =
+  boost::shared_ptr< std::set< const Candidate * > > &candidates =
     ( *path_to_candidates )[ filepath ];
 
   if ( !candidates )
-    candidates.reset( new std::list< const Candidate * >() );
+    candidates.reset( new std::set< const Candidate * >() );
 
   return *candidates;
 }
 
+// WARNING: You need to hold the filetype_candidate_map_mutex_ before calling
+// this function.
+void IdentifierDatabase::AddIdentifiersNoLock(
+  const std::vector< std::string > &new_candidates,
+  const std::string &filetype,
+  const std::string &filepath ) {
+  std::set< const Candidate *> &candidates =
+    GetCandidateSet( filetype, filepath );
+
+  std::vector< const Candidate * > repository_candidates =
+    candidate_repository_.GetCandidatesForStrings( new_candidates );
+
+  candidates.insert( repository_candidates.begin(),
+                     repository_candidates.end() );
+}
+
 
 
 } // namespace YouCompleteMe
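
The locking split above (public `AddIdentifiers` overloads that take the mutex, plus a private `AddIdentifiersNoLock`) and the switch from `std::list` to `std::set` can be sketched in Python. This is a minimal illustration under stated assumptions, not the YCM implementation: `IdentifierStore` and its method names are hypothetical, and plain strings stand in for `Candidate` pointers.

```python
import threading
from collections import defaultdict


class IdentifierStore:
    """Hypothetical stand-in for IdentifierDatabase: filetype -> filepath -> set."""

    def __init__(self):
        # A plain (non-reentrant) lock, like boost::mutex in the diff.
        self._lock = threading.Lock()
        self._by_filetype = defaultdict(lambda: defaultdict(set))

    def _add_no_lock(self, identifiers, filetype, filepath):
        # Caller must already hold self._lock. Using a set means that
        # re-adding the same identifier for a file does not create duplicates.
        self._by_filetype[filetype][filepath].update(identifiers)

    def add_identifiers(self, identifiers, filetype, filepath):
        with self._lock:
            self._add_no_lock(identifiers, filetype, filepath)

    def add_identifier_map(self, filetype_identifier_map):
        # Take the lock once for the whole map; calling add_identifiers()
        # from here would try to re-acquire the lock and deadlock.
        with self._lock:
            for filetype, by_path in filetype_identifier_map.items():
                for filepath, identifiers in by_path.items():
                    self._add_no_lock(identifiers, filetype, filepath)

    def identifiers_for(self, filetype):
        with self._lock:
            found = set()
            for identifiers in self._by_filetype.get(filetype, {}).values():
                found |= identifiers
            return found
```

Re-indexing a tags file after it changes would insert the same identifiers again, which is presumably why a set is the more natural container here than a list.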

+ 25 - 7
cpp/ycm/IdentifierDatabase.h

@@ -23,9 +23,10 @@
 #include <boost/thread/mutex.hpp>
 #include <boost/shared_ptr.hpp>
 
-#include <list>
 #include <vector>
 #include <string>
+#include <map>
+#include <set>
 
 namespace YouCompleteMe {
 
@@ -34,6 +35,15 @@ class Result;
 class CandidateRepository;
 
 
+// filepath -> identifiers
+typedef std::map< std::string, std::vector< std::string > >
+  FilepathToIdentifiers;
+
+// filetype -> (filepath -> identifiers)
+typedef std::map< std::string, FilepathToIdentifiers >
+  FiletypeIdentifierMap;
+
+
 // This class stores the database of identifiers the identifier completer has
 // seen. It stores them in a data structure that makes it easy to tell which
 // identifier came from which file and what files have which filetypes.
@@ -47,7 +57,9 @@ class IdentifierDatabase : boost::noncopyable {
 public:
   IdentifierDatabase();
 
-  void AddCandidates(
+  void AddIdentifiers( const FiletypeIdentifierMap &filetype_identifier_map );
+
+  void AddIdentifiers(
     const std::vector< std::string > &new_candidates,
     const std::string &filetype,
     const std::string &filepath );
@@ -60,24 +72,30 @@ public:
                                std::vector< Result > &results ) const;
 
 private:
-  std::list< const Candidate * > &GetCandidateList(
+  std::set< const Candidate * > &GetCandidateSet(
     const std::string &filetype,
     const std::string &filepath );
 
+  void AddIdentifiersNoLock(
+    const std::vector< std::string > &new_candidates,
+    const std::string &filetype,
+    const std::string &filepath );
+
+
   // filepath -> *( *candidate )
   typedef boost::unordered_map < std::string,
-          boost::shared_ptr< std::list< const Candidate * > > >
+          boost::shared_ptr< std::set< const Candidate * > > >
           FilepathToCandidates;
 
   // filetype -> *( filepath -> *( *candidate ) )
   typedef boost::unordered_map < std::string,
-          boost::shared_ptr< FilepathToCandidates > > FiletypeMap;
+          boost::shared_ptr< FilepathToCandidates > > FiletypeCandidateMap;
 
 
   CandidateRepository &candidate_repository_;
 
-  FiletypeMap filetype_map_;
-  mutable boost::mutex filetype_map_mutex_;
+  FiletypeCandidateMap filetype_candidate_map_;
+  mutable boost::mutex filetype_candidate_map_mutex_;
 };
 
 } // namespace YouCompleteMe

+ 114 - 2
cpp/ycm/IdentifierUtils.cpp

@@ -16,13 +16,20 @@
 // along with YouCompleteMe.  If not, see <http://www.gnu.org/licenses/>.
 
 #include "IdentifierUtils.h"
+#include "Utils.h"
 #include "standard.h"
 
+#include <boost/unordered_map.hpp>
+#include <boost/assign/list_of.hpp>
 #include <boost/regex.hpp>
 #include <boost/algorithm/string/regex.hpp>
 
 namespace YouCompleteMe {
 
+namespace fs = boost::filesystem;
+
+namespace {
+
 const char *COMMENT_AND_STRING_REGEX =
   "//.*?$" // Anything following '//'
   "|"
@@ -40,6 +47,69 @@ const char *COMMENT_AND_STRING_REGEX =
 
 const char *IDENTIFIER_REGEX = "[_a-zA-Z]\\w*";
 
+// For details on the supported tag format, see:
+// http://ctags.sourceforge.net/FORMAT
+// TL;DR: The only supported format is the one Exuberant Ctags emits.
+const char *TAG_REGEX =
+  "^([^\\t\\n\\r]+)"  // The first field is the identifier
+  "\\t"  // A TAB char is the field separator
+  // The second field is the path to the file that has the identifier; either
+  // absolute or relative to the tags file.
+  "([^\\t\\n\\r]+)"
+  "\\t.*?"  // Non-greedy everything
+  "language:([^\\t\\n\\r]+)"  // We want to capture the language of the file
+  ".*?$";
+
+
+// List of languages Exuberant Ctags supports:
+//   ctags --list-languages
+// To map a language name to a filetype, see this file:
+//   :e $VIMRUNTIME/filetype.vim
+const boost::unordered_map< std::string, std::string > LANG_TO_FILETYPE =
+  boost::assign::map_list_of
+      ( std::string( "Ant"        ), std::string( "ant"        ) )
+      ( std::string( "Asm"        ), std::string( "asm"        ) )
+      ( std::string( "Awk"        ), std::string( "awk"        ) )
+      ( std::string( "Basic"      ), std::string( "basic"      ) )
+      ( std::string( "C++"        ), std::string( "cpp"        ) )
+      ( std::string( "C#"         ), std::string( "cs"         ) )
+      ( std::string( "C"          ), std::string( "c"          ) )
+      ( std::string( "COBOL"      ), std::string( "cobol"      ) )
+      ( std::string( "DosBatch"   ), std::string( "dosbatch"   ) )
+      ( std::string( "Eiffel"     ), std::string( "eiffel"     ) )
+      ( std::string( "Erlang"     ), std::string( "erlang"     ) )
+      ( std::string( "Fortran"    ), std::string( "fortran"    ) )
+      ( std::string( "HTML"       ), std::string( "html"       ) )
+      ( std::string( "Java"       ), std::string( "java"       ) )
+      ( std::string( "JavaScript" ), std::string( "javascript" ) )
+      ( std::string( "Lisp"       ), std::string( "lisp"       ) )
+      ( std::string( "Lua"        ), std::string( "lua"        ) )
+      ( std::string( "Make"       ), std::string( "make"       ) )
+      ( std::string( "MatLab"     ), std::string( "matlab"     ) )
+      ( std::string( "OCaml"      ), std::string( "ocaml"      ) )
+      ( std::string( "Pascal"     ), std::string( "pascal"     ) )
+      ( std::string( "Perl"       ), std::string( "perl"       ) )
+      ( std::string( "PHP"        ), std::string( "php"        ) )
+      ( std::string( "Python"     ), std::string( "python"     ) )
+      ( std::string( "REXX"       ), std::string( "rexx"       ) )
+      ( std::string( "Ruby"       ), std::string( "ruby"       ) )
+      ( std::string( "Scheme"     ), std::string( "scheme"     ) )
+      ( std::string( "Sh"         ), std::string( "sh"         ) )
+      ( std::string( "SLang"      ), std::string( "slang"      ) )
+      ( std::string( "SML"        ), std::string( "sml"        ) )
+      ( std::string( "SQL"        ), std::string( "sql"        ) )
+      ( std::string( "Tcl"        ), std::string( "tcl"        ) )
+      ( std::string( "Tex"        ), std::string( "tex"        ) )
+      ( std::string( "Vera"       ), std::string( "vera"       ) )
+      ( std::string( "Verilog"    ), std::string( "verilog"    ) )
+      ( std::string( "VHDL"       ), std::string( "vhdl"       ) )
+      ( std::string( "Vim"        ), std::string( "vim"        ) )
+      ( std::string( "YACC"       ), std::string( "yacc"       ) );
+
+const std::string NOT_FOUND = "YCMFOOBAR_NOT_FOUND";
+
+}  // unnamed namespace
+
 
 std::string RemoveIdentifierFreeText( std::string text ) {
   boost::erase_all_regex( text, boost::regex( COMMENT_AND_STRING_REGEX ) );
@@ -52,8 +122,8 @@ std::vector< std::string > ExtractIdentifiersFromText(
   std::string::const_iterator start = text.begin();
   std::string::const_iterator end   = text.end();
 
-  boost::match_results< std::string::const_iterator > matches;
-  boost::regex expression( IDENTIFIER_REGEX );
+  boost::smatch matches;
+  const boost::regex expression( IDENTIFIER_REGEX );
 
   std::vector< std::string > identifiers;
 
@@ -65,4 +135,46 @@ std::vector< std::string > ExtractIdentifiersFromText(
   return identifiers;
 }
 
+
+FiletypeIdentifierMap ExtractIdentifiersFromTagsFile(
+    const fs::path &path_to_tag_file ) {
+  FiletypeIdentifierMap filetype_identifier_map;
+  std::string tags_file_contents;
+
+  try {
+    tags_file_contents = ReadUtf8File( path_to_tag_file );
+  } catch (...) {
+    return filetype_identifier_map;
+  }
+
+  std::string::const_iterator start = tags_file_contents.begin();
+  std::string::const_iterator end   = tags_file_contents.end();
+
+  boost::smatch matches;
+  const boost::regex expression( TAG_REGEX );
+  const boost::match_flag_type options = boost::match_not_dot_newline;
+
+  while ( boost::regex_search( start, end, matches, expression, options ) ) {
+    start = matches[ 0 ].second;
+
+    std::string language( matches[ 3 ] );
+    std::string filetype = FindWithDefault( LANG_TO_FILETYPE,
+                                            language,
+                                            NOT_FOUND );
+
+    if ( filetype == NOT_FOUND )
+      continue;
+
+    std::string identifier( matches[ 1 ] );
+    fs::path path( matches[ 2 ] );
+
+    if ( path.is_relative() )
+      path = path_to_tag_file.parent_path() / path;
+
+    filetype_identifier_map[ filetype ][ path.string() ].push_back( identifier );
+  }
+
+  return filetype_identifier_map;
+}
+
 } // namespace YouCompleteMe
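
To make the behaviour of `TAG_REGEX` and `ExtractIdentifiersFromTagsFile` easier to follow, here is a rough Python sketch of the same line-matching logic. It is an approximation, not a port: `TAG_LINE` and `extract_identifiers` are hypothetical names, the language table is trimmed to a few entries, `re.MULTILINE` is used so that `^` and `$` anchor at each line, and POSIX-style paths are assumed.

```python
import posixpath
import re
from collections import defaultdict

# Same shape as TAG_REGEX in the diff: identifier, TAB, path, TAB, then a
# "language:<name>" field somewhere later on the same line.
TAG_LINE = re.compile(
    r"^([^\t\n\r]+)"                # field 1: the identifier
    r"\t([^\t\n\r]+)"               # field 2: path (absolute or relative)
    r"\t.*?language:([^\t\n\r]+)"   # the ctags language name
    r".*?$",
    re.MULTILINE,
)

# Trimmed, illustrative subset of the LANG_TO_FILETYPE table in the diff.
LANG_TO_FILETYPE = {"C++": "cpp", "C#": "cs", "C": "c", "Python": "python"}


def extract_identifiers(tags_text, tags_dir):
    """Return filetype -> filepath -> [identifiers] for one tags file."""
    result = defaultdict(lambda: defaultdict(list))
    for match in TAG_LINE.finditer(tags_text):
        identifier, path, language = match.groups()
        filetype = LANG_TO_FILETYPE.get(language)
        if filetype is None:
            continue  # unknown language: skip the tag, as the C++ code does
        if not posixpath.isabs(path):
            # Relative paths are resolved against the tags file's directory.
            path = posixpath.join(tags_dir, path)
        result[filetype][path].append(identifier)
    return result
```

Lines without a `language:` field (the `!_TAG_` header lines, or a bare `bloa	foo	junky`) never match, so they are dropped without any special casing.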

+ 7 - 0
cpp/ycm/IdentifierUtils.h

@@ -18,9 +18,13 @@
 #ifndef IDENTIFIERUTILS_CPP_WFFUZNET
 #define IDENTIFIERUTILS_CPP_WFFUZNET
 
+#include "IdentifierDatabase.h"
+
 #include <vector>
 #include <string>
 
+#include <boost/filesystem.hpp>
+
 namespace YouCompleteMe {
 
 // NOTE: this function accepts the text param by value on purpose; it internally
@@ -33,6 +37,9 @@ std::string RemoveIdentifierFreeText( std::string text );
 std::vector< std::string > ExtractIdentifiersFromText(
   const std::string &text );
 
+FiletypeIdentifierMap ExtractIdentifiersFromTagsFile(
+    const boost::filesystem::path &path_to_tag_file );
+
 } // namespace YouCompleteMe
 
 #endif /* end of include guard: IDENTIFIERUTILS_CPP_WFFUZNET */

+ 1 - 1
cpp/ycm/Utils.h

@@ -56,7 +56,7 @@ typename Container::mapped_type
 FindWithDefault( Container &container,
                  const Key &key,
                  const typename Container::mapped_type &value ) {
-  typename Container::iterator it = container.find( key );
+  typename Container::const_iterator it = container.find( key );
   return it != container.end() ? it->second : value;
 }
 

+ 6 - 11
cpp/ycm/tests/ClangCompleter/ClangCompleter_test.cpp

@@ -17,25 +17,23 @@
 
 #include "ClangCompleter.h"
 #include "CompletionData.h"
+#include "../TestUtils.h"
+
 #include <gtest/gtest.h>
 #include <gmock/gmock.h>
 
 #include <boost/filesystem.hpp>
-namespace fs = boost::filesystem;
+
+namespace YouCompleteMe {
 
 using ::testing::ElementsAre;
 using ::testing::WhenSorted;
 
-namespace YouCompleteMe {
-
 TEST( ClangCompleterTest, CandidatesForLocationInFile ) {
-  fs::path path_to_testdata = fs::current_path() / fs::path( "testdata" );
-  fs::path test_file = path_to_testdata / fs::path( "basic.cpp" );
-
   ClangCompleter completer;
   std::vector< CompletionData > completions =
     completer.CandidatesForLocationInFile(
-      test_file.string(),
+      PathToTestFile( "basic.cpp" ).string(),
       11,
       7,
       std::vector< UnsavedFile >(),
@@ -46,16 +44,13 @@ TEST( ClangCompleterTest, CandidatesForLocationInFile ) {
 
 
 TEST( ClangCompleterTest, CandidatesForQueryAndLocationInFileAsync ) {
-  fs::path path_to_testdata = fs::current_path() / fs::path( "testdata" );
-  fs::path test_file = path_to_testdata / fs::path( "basic.cpp" );
-
   ClangCompleter completer;
   completer.EnableThreading();
 
   Future< AsyncCompletions > completions_future =
     completer.CandidatesForQueryAndLocationInFileAsync(
       "",
-      test_file.string(),
+      PathToTestFile( "basic.cpp" ).string(),
       11,
       7,
       std::vector< UnsavedFile >(),

+ 13 - 0
cpp/ycm/tests/IdentifierCompleter_test.cpp

@@ -216,6 +216,19 @@ TEST( IdentifierCompleterTest, ShorterAndLowercaseWins ) {
                             "STDIN_FILENO" ) );
 }
 
+TEST( IdentifierCompleterTest, TagsEndToEndWorks ) {
+  IdentifierCompleter completer;
+  std::vector< std::string > tag_files;
+  tag_files.push_back( PathToTestFile( "basic.tags" ).string() );
+
+  completer.AddIdentifiersToDatabaseFromTagFiles( tag_files );
+
+  EXPECT_THAT( completer.CandidatesForQueryAndType( "fo", "cpp" ),
+               ElementsAre( "foosy",
+                            "fooaaa" ) );
+
+}
+
 // TODO: tests for filepath and filetype candidate storing
 
 } // namespace YouCompleteMe

+ 32 - 3
cpp/ycm/tests/IdentifierUtils_test.cpp

@@ -16,14 +16,19 @@
 // along with YouCompleteMe.  If not, see <http://www.gnu.org/licenses/>.
 
 #include "IdentifierUtils.h"
+#include "TestUtils.h"
+#include "IdentifierDatabase.h"
+
 #include <gtest/gtest.h>
 #include <gmock/gmock.h>
-
-using ::testing::ElementsAre;
-using ::testing::WhenSorted;
+#include <boost/filesystem.hpp>
 
 namespace YouCompleteMe {
 
+namespace fs = boost::filesystem;
+using ::testing::ElementsAre;
+using ::testing::ContainerEq;
+using ::testing::WhenSorted;
 
 TEST( IdentifierUtilsTest, RemoveIdentifierFreeTextWorks ) {
   EXPECT_STREQ( RemoveIdentifierFreeText(
@@ -127,5 +132,29 @@ TEST( IdentifierUtilsTest, ExtractIdentifiersFromTextWorks ) {
 
 }
 
+
+TEST( IdentifierUtilsTest, ExtractIdentifiersFromTagsFileWorks ) {
+  fs::path testfile = PathToTestFile( "basic.tags" );
+  fs::path testfile_parent = testfile.parent_path();
+
+  FiletypeIdentifierMap expected;
+  expected[ "cpp" ][ ( testfile_parent / "foo" ).string() ]
+    .push_back( "i1" );
+  expected[ "cpp" ][ ( testfile_parent / "bar" ).string() ]
+    .push_back( "i1" );
+  expected[ "cpp" ][ ( testfile_parent / "foo" ).string() ]
+    .push_back( "foosy" );
+  expected[ "cpp" ][ ( testfile_parent / "bar" ).string() ]
+    .push_back( "fooaaa" );
+
+  expected[ "c" ][ "/foo/zoo" ].push_back( "Floo::goo" );
+  expected[ "c" ][ "/foo/goo maa" ].push_back( "!goo" );
+
+  expected[ "cs" ][ "/m_oo" ].push_back( "#bleh" );
+
+  EXPECT_THAT( ExtractIdentifiersFromTagsFile( testfile ),
+               ContainerEq( expected ) );
+}
+
 } // namespace YouCompleteMe
 

+ 7 - 0
cpp/ycm/tests/TestUtils.cpp

@@ -19,6 +19,8 @@
 
 namespace YouCompleteMe {
 
+namespace fs = boost::filesystem;
+
 std::vector< std::string > StringVector( const std::string &a,
                                          const std::string &b,
                                          const std::string &c,
@@ -58,5 +60,10 @@ std::vector< std::string > StringVector( const std::string &a,
   return string_vector;
 }
 
+boost::filesystem::path PathToTestFile( const std::string &filepath ) {
+  fs::path path_to_testdata = fs::current_path() / fs::path( "testdata" );
+  return path_to_testdata / fs::path( filepath );
+}
+
 } // namespace YouCompleteMe
 

+ 4 - 0
cpp/ycm/tests/TestUtils.h

@@ -21,6 +21,8 @@
 #include <vector>
 #include <string>
 
+#include <boost/filesystem.hpp>
+
 namespace YouCompleteMe {
 
 std::vector< std::string > StringVector( const std::string &a,
@@ -33,6 +35,8 @@ std::vector< std::string > StringVector( const std::string &a,
                                          const std::string &h = std::string(),
                                          const std::string &i = std::string() );
 
+boost::filesystem::path PathToTestFile( const std::string &filepath );
+
 } // namespace YouCompleteMe
 
 #endif /* end of include guard: TESTUTILS_H_G4RKMGUD */

+ 15 - 0
cpp/ycm/tests/testdata/basic.tags

@@ -0,0 +1,15 @@
+!_TAG_FILE_FORMAT	2	/extended format; --format=1 will not append ;" to lines/
+!_TAG_FILE_SORTED	2	/0=unsorted, 1=sorted, 2=foldcase/
+!_TAG_PROGRAM_AUTHOR	Darren Hiebert	/dhiebert@users.sourceforge.net/
+!_TAG_PROGRAM_NAME	Exuberant Ctags	//
+!_TAG_PROGRAM_URL	http://ctags.sourceforge.net	/official site/
+!_TAG_PROGRAM_VERSION	5.8	//
+i1	foo	junky;'junklanguage:C++
+i1	bar	junky;'junklanguage:C++
+foosy	foo	junky;"'junk	language:C++	zanzibar
+fooaaa	bar	junky;"'junk	language:C++	zanzibar
+bloa	foo	junky
+Floo::goo	/foo/zoo	language:C
+!goo	/foo/goo maa	language:C
+zoro	/foo	language:fakelang
+#bleh	/m_oo	123ntoh;;;\"eu	language:C#	;\"

+ 7 - 5
cpp/ycm/ycm_core.cpp

@@ -46,7 +46,7 @@ int YcmCoreVersion()
 {
   // We increment this every time when we want to force users to recompile
   // ycm_core.
-  return 3;
+  return 4;
 }
 
 
@@ -61,10 +61,12 @@ BOOST_PYTHON_MODULE(ycm_core)
 
   class_< IdentifierCompleter, boost::noncopyable >( "IdentifierCompleter" )
     .def( "EnableThreading", &IdentifierCompleter::EnableThreading )
-    .def( "AddCandidatesToDatabase",
-          &IdentifierCompleter::AddCandidatesToDatabase )
-    .def( "AddCandidatesToDatabaseFromBufferAsync",
-          &IdentifierCompleter::AddCandidatesToDatabaseFromBufferAsync )
+    .def( "AddIdentifiersToDatabase",
+          &IdentifierCompleter::AddIdentifiersToDatabase )
+    .def( "AddIdentifiersToDatabaseFromTagFilesAsync",
+          &IdentifierCompleter::AddIdentifiersToDatabaseFromTagFilesAsync )
+    .def( "AddIdentifiersToDatabaseFromBufferAsync",
+          &IdentifierCompleter::AddIdentifiersToDatabaseFromBufferAsync )
     .def( "CandidatesForQueryAndTypeAsync",
           &IdentifierCompleter::CandidatesForQueryAndTypeAsync );
 

+ 3 - 0
plugin/youcompleteme.vim

@@ -97,6 +97,9 @@ let g:ycm_complete_in_strings =
 let g:ycm_collect_identifiers_from_comments_and_strings =
       \ get( g:, 'ycm_collect_identifiers_from_comments_and_strings', 0 )
 
+let g:ycm_collect_identifiers_from_tags_files =
+      \ get( g:, 'ycm_collect_identifiers_from_tags_files', 1 )
+
 let g:ycm_autoclose_preview_window_after_completion =
       \ get( g:, 'ycm_autoclose_preview_window_after_completion', 0 )
 

+ 1 - 1
python/ycm/base.py

@@ -70,7 +70,7 @@ def CurrentIdentifierFinished():
     return line[ : current_column ].isspace()
 
 
-COMPATIBLE_WITH_CORE_VERSION = 3
+COMPATIBLE_WITH_CORE_VERSION = 4
 
 def CompatibleWithYcmCore():
   try:

+ 38 - 4
python/ycm/completers/all/identifier_completer.py

@@ -17,8 +17,10 @@
 # You should have received a copy of the GNU General Public License
 # along with YouCompleteMe.  If not, see <http://www.gnu.org/licenses/>.
 
+import os
 import vim
 import ycm_core
+from collections import defaultdict
 from ycm.completers.general_completer import GeneralCompleter
 from ycm import vimsupport
 from ycm import utils
@@ -33,6 +35,7 @@ class IdentifierCompleter( GeneralCompleter ):
     super( IdentifierCompleter, self ).__init__()
     self.completer = ycm_core.IdentifierCompleter()
     self.completer.EnableThreading()
+    self.tags_file_last_mtime = defaultdict( int )
 
 
   def ShouldUseNow( self, start_column ):
@@ -55,9 +58,9 @@ class IdentifierCompleter( GeneralCompleter ):
 
     vector = ycm_core.StringVec()
     vector.append( identifier )
-    self.completer.AddCandidatesToDatabase( vector,
-                                            filetype,
-                                            filepath )
+    self.completer.AddIdentifiersToDatabase( vector,
+                                             filetype,
+                                             filepath )
 
 
   def AddPreviousIdentifier( self ):
@@ -88,16 +91,47 @@ class IdentifierCompleter( GeneralCompleter ):
       return
 
     text = "\n".join( vim.current.buffer )
-    self.completer.AddCandidatesToDatabaseFromBufferAsync(
+    self.completer.AddIdentifiersToDatabaseFromBufferAsync(
       text,
       filetype,
       filepath,
       collect_from_comments_and_strings )
 
 
+  def AddIdentifiersFromTagFiles( self ):
+    tag_files = vim.eval( 'tagfiles()' )
+    current_working_directory = os.getcwd()
+    absolute_paths_to_tag_files = ycm_core.StringVec()
+    for tag_file in tag_files:
+      absolute_tag_file = os.path.join( current_working_directory,
+                                        tag_file )
+      try:
+        current_mtime = os.path.getmtime( absolute_tag_file )
+      except OSError:
+        continue
+      last_mtime = self.tags_file_last_mtime[ absolute_tag_file ]
+
+      # We don't want to repeatedly process the same file over and over; we only
+      # process if it's changed since the last time we looked at it
+      if current_mtime <= last_mtime:
+        continue
+
+      self.tags_file_last_mtime[ absolute_tag_file ] = current_mtime
+      absolute_paths_to_tag_files.append( absolute_tag_file )
+
+    if not absolute_paths_to_tag_files:
+      return
+
+    self.completer.AddIdentifiersToDatabaseFromTagFilesAsync(
+      absolute_paths_to_tag_files )
+
+
   def OnFileReadyToParse( self ):
     self.AddBufferIdentifiers()
 
+    if vimsupport.GetBoolValue( 'g:ycm_collect_identifiers_from_tags_files' ):
+      self.AddIdentifiersFromTagFiles()
+
 
   def OnInsertLeave( self ):
     self.AddIdentifierUnderCursor()